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Modular. 

Integrated. 

Now. 


Handle Writer/Spell™ 

Word processing with integrated 
spelling correction and verification. 


Handle Calc™ 

Spreadsheet with up to 32,000 
rows and columns. Conditional 
and iterative recalculation. 


The Handle Office-Automation Series is a powerful set of modular, 
integrated software tools developed for today's multiuser office 
environment. Handle application modules can be used stand-alone 
or combined into a fully integrated system. 

The Handle Office-Automation Series modules offer: 

• Ease of Use and Learning 

• Insulation from UNIX 

• Data Sharing Between Multiple Users 

• Data Integration Between Modules 

• Data Sharing with Other Software Products 

• Sophisticated Document Security System 


Handle Technologies, Inc. 


Corporate Office 

6300 Richmond 
3rd Floor 

Houston, TX 77057 
(713)266-1415 


Sales and Product Information 

850 North Lake Tahoe Blvd. 
P.O. Box 1913 
Tahoe City, CA 95730 
(916) 583-7283 


TM-HANDLE, HANDLE HOST, HANDLE WRITER, HANDLE SPELL HANDLE WRITER/SPELL and HANDLE CALC ARE TRADEMARKS OF HANDLE TECHNOLOGIES, INC. 
TM-UNIX IS A TRADEMARK OF AT&T BELL LABORATORIES. 
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file manager. And C-ISAM? the de facto 
standard ISAM for UNIX. It’s built into all 
our products, but you can buy it separately. 

And when you choose RDS, youll be 
in the company of some other good com¬ 
panies. Computer manufacturers including 
AT&T, Northern Telecom, Altos and over 
60 others. And major corporations like 
Anheuser Busch, First National Bank of 
Chicago and Pacific Bell. 

Which makes sense. After all, only RDS 
offers a family of products that work so well 
together. As well as with so many industry 
standards. 


So call us for a demo, a manual and a 
copy of our Independent Software Vendor 
Catalog. Software vendors be sure to ask 
about our new “Hooks” software integration 
program. Our number: 415/424-1300. 

Or write RDS, 2471 East Bayshore Road, 
Palo Alto, CA 94303. 

And we’ll show you how we took a good 
idea and made it better. 



RELATIONAL DATABASE SYSTEMS, INC. 
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DATABASE INTRICACIES 


42 INTERVIEW WITH PETER 
WEINBERGER 


By Ned Peirce 58 

One of Bell Labs' foremost database 


TRANSACTION 

PROCESSING 




gurus calls them as he sees them. 


BACKEND DATABASE 
MACHINES 

By Paula Hawthorn 

The whys and wherefores of database 
work on dedicated machines. 


By Kathryn Anderson 

Not only is TP possible under UNIX— 
it can actually be done without kernel 
modifications. 
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VIEWPOINT 


Getting down to business 


People who endorse UNIX as a 
business solution have learned to 
expect snickers and raised eye¬ 
brows for their trouble. It’s not 
difficult to see why. 

Despite a surge in applications 
development over the past 18 
months, UNIX is still commonly 
perceived as the exclusive play¬ 
ground of the technically sophis¬ 
ticated. Databases somehow do 
not fit into that picture. 

The fact of the matter, though, 
is that database management is 
possible in the UNIX environ¬ 
ment. A large body of software— 
over 50 UNIX DBMS products by a 
recent count—testifies to that. 
Nevertheless, reservations per¬ 
sist for good reason. 

First, there is the hierarchical 
file structure of UNIX. Despite its 
elegance and overall functional¬ 
ity, it fails, to meet the specific 
needs of relational database soft¬ 
ware. To compound matters, 
UNIX lacks standardized file and 
record locking. 

All is not lost, however. Clever 
software developers have man¬ 
aged to work around the oper¬ 
ating system’s limitations to 
produce DBMS software that suf¬ 
fers neither for lack of functional¬ 
ity nor performance. 

This issue of UNIX REVIEW 
explores some of the strategies 
employed by those developers. 
Eric Allman lays out the chal¬ 
lenge in the introductory article, 
which he begins with a discus¬ 
sion of the evolution of data 
management and concludes with 
a survey of desirable database 
features. 

Roger Sippl, president of Rela¬ 
tional Database Systems Inc., 
follows with a piece that details 
how UNIX succeeds and fails in 
meeting the needs of data man¬ 
agement software. A few modest 


proposals are interspersed. 

One of the strategies Roger 
introduces is the use of a backend 
machine. Paula Hawthorn ex¬ 
pands on this notion in the article 
that follows. Using Britton Lee’s 
Intelligent Database Machine as 
an example, she discusses how 
the use of a separate dedicated 
processor for database work 
can improve DBMS performance, 
even while it frees up CPU cycles 
for other tasks. 

Another piece, authored by 
Kathryn Anderson, probes strat¬ 
egies that can be used to facilitate 
transaction processing. Although 
TP is purportedly a bugaboo topic 
in UNIXland, Kathryn tells how 
System V can support robust 
transaction applications—with¬ 
out resorting to kernel modifica¬ 
tions. She focuses largely on the 
topics of control and tunability. 

Ned Peirce closes out the theme 
with an interview of Peter Wein¬ 
berger, the head of Computer 
Science Research at AT&T Bell 
Labs. Peter’s database work 
under UNIX has become well 
known in development circles 
and his comments on the suitabil¬ 
ity of UNIX for database work 
carry the tone of authority. 

As you may have guessed, all 
voices in the issue support UNIX 
as a database environment, al¬ 
though it’s acknowledged that 
some adjustments are necessary. 
This should be heartening to 
those who wish to see UNIX 
succeed in the business commu¬ 
nity, because database software 
will be the foundation upon 
which much of tomorrow’s busi¬ 
ness applications will be built. 
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How to go 
from 

UNIXtoDOS 

without. 
compromising 

your 

standards. 

It’s easy. Just get an industry standard file 
access method that works on both. 

C-ISAM™ from RDS. 

It’s been the UNIX” standard for years 
(used in more UNIX languages and programs 
than any other access method), and it’s fast 
becoming the standard for DOS. 

Why? 

Because of the way it works. Its B+ Tree 
indexing structure offers unlimited indexes. 

There’s also automatic or manual record 
locking and optional transaction audit 
trails. Plus index compression to save disk 
space and cut access times. 

<0 1985, Relational Database Systems, Inc. UNIX is a trademark of AT&T Bell laboratories. 

INFORMIX is a registered trademark and RDS, C-ISAM and File-It! are trademarks of 
Relational Database Systems, Inc. 


How can we be so sure C-ISAM works 
so well? We use it ourselves. It’s a part 
of INFORMIX: INFORMIX-SQL and File-itl? 
our best selling database management 
programs. 

For an information packet, call (415) 
424-1300. Or write RDS, 2471 East Bayshore 
Road, Palo Alto, CA 94303. 

You’ll see why anything less than C-ISAM 
is just a compromise. 

RELATIONAL DATABASE SYSTEMS, INC. 
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How we 


as part of the program, you can ask more of 
your database. Using the emerging industry- 
• 1 standard query language. 

improved Structured Tomakeyouriob 


Queiy Language. 


Actually, we didn’t change a thing. 

We just combined it with the best 
relational database management system. 
Introducing INFORMIX-SQL. 

It runs on either UNIX” or MS”-DOS 


easier, INFORMIX-SQL 
comes with the most complete 
set of application building 
tools. Including a full report 
writer and screen generator. Plus a family 
of companion products that all work 
together. 

Like our embedded SQLs for C and 
COBOL. So you can easily link your pro- 


operating systems. And now with IBM’s SQL grams with ours. File-it!” our easy-to-use 


~— 1 — ~ 

_ 


INFORMIX is a registered trademark and RDS, C-1SAM and File-it! are trademarks of Relational Database Systems, Inc. IBM, UNIX and MS are trademarks oflntemational Business Machines Corporation, 
AT&T Bell Laboratories and Microsoft, respectively. © 1985, Relational Database Systems, Inc. 





















The First Name In 
Integrated Office 
Automation Software 



Executive Mail 

Telephone 

Directory 


Menu Processor 
Word Processor 
Forms/Data Base 
Spreadsheet 


Certified and 
Deliverable Since 1981 


XED was the first independent software 
company to introduce a Unix WP package 
and achieved early success by selling to 
the government and international market 
(XED is the only Unix WP package to meet 
government specifications). Worldwide 
sales of XED rank Computer Methods first 
in both sales and units installed in 1984. 






INTEGRATED OFFICE S O F TWARE _ 

Box 3938 • Chatsworth, CA 91313 U.S.A. • (818) 884-2000 
FAX (818) 884-3870 • Inti. TLX 292 662 XED UR 


XED is a registered trademark of CCL Datentechnik AG 
UNIX is a trademark of AT & T Bell Laboratories, Inc. 
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THE MONTHLY 
REPORT 


Pyramid power 

by Roger Strukhoff 


One of the more impressive 
machines making its debut at the 
National Computer Conference in 
Chicago July 16 was Pyramid 
Technology Corp.’s 98x, a super¬ 
minicomputer built with the idea 
of equal opportunity for equal 
microprocessors. Pyramid calls 
the 98x an “isoprocessor” sys¬ 
tem; there are two 32-bit RISC 
(reduced instruction set com¬ 
puter) processors working as 
equal partners. If one processor 
fails, the system continues to run 
at about 60 percent of capacity. 

The 98x supports up to 98 
users, comes with as much as 16 
MB of main memory, and offers 
disk drives with nearly a gigabyte 
of capacity (“only” 940 MB actu¬ 
ally). The machine uses Pyra¬ 
mid’s OSx operating system, a 
dual port (or “dualPort”, in Pyra¬ 
mid parlance) of UNIX, incorpo¬ 
rating both 4.2BSD and System 
V. 

Charles Krahling, Pyramid’s 
director of marketing, says his 
company is fully aware of the 
traditionally weak I/O of boxes 
based on UNIX, and so has devel¬ 
oped a new I/O subsystem for 
the 98x. “I/O (in any system) 
can always use improvement,” 
Krahling noted. “Plus, with the 
powerful disks we see coming out 
in the near future, we needed to 
develop a stronger I/O system for 
our machines.” The subsystem 
has what Pyramid calls an Intelli¬ 



gent I/O Processor (IOP), with an 
aggregate throughput capability 
(PTAL bandwidth) of 11 MB per 
second. The IOP also provides 
disk rotational sensing and over¬ 
lapped seeks. 

The idea behind isoprocessing 
comes from Purdue’s dual pro¬ 
cessor implementation, Krahling 
said. CPU balance is achieved 
through the use of a proprietary 
semaphore system that protects 
critical sequences of code and 
controls simultaneous access 
to kernel data structures. The 
semaphore is designed to comple¬ 
ment standard UNIX process syn¬ 
chronization concepts, allowing 
symmetric multiprocessor sup¬ 
port without the need to imple¬ 
ment major structural changes to 
the UNIX kernel. Normal UNIX 
organization is maintained, but 
both CPUs share a single copy of 


the OSx kernel and data struc¬ 
tures, and have equal access to all 
shared resources. 

The isoprocessors are imple¬ 
mented in a fast Shottky TTL, 
and have a cycle time of only 100 
nanoseconds. Each isoprocessor 
has an instruction unit (I-unit), 
an execution unit (E-unit), a 32K 
byte data cache, and a microcode 
sequencer. The I-units prefetch 
instructions from a 4K byte in¬ 
struction cache and take oper¬ 
ands from register stacks or im¬ 
mediate fields. The pipelining 
architecture of the 98x overlaps I- 
unit and E-unit activity. 

RISC technology is, of course, 
key to the system’s performance. 
Each CPU uses 528 registers of 32 
bits each, implemented in stack 
form with 16 levels of 32 regis¬ 
ters, plus 16 global registers. The 
register stack allows parameters 
to be passed between stack levels 
without data being moved. 

The 98x will be shipped in 
October, according to Krahling. 
Prices vary between $260,000 
and $500,000. Field upgrades 
for 90x users will cost about 
$90,000. Pyramid also has an 
extended system in the works, the 
98xE, which will add an I/O 
expansion bay to support a maxi¬ 
mum of 256 users. 

Pyramid began shipping com¬ 
puter systems in October 1983. It 
had gross sales of $12 million in 
1984, and is anticipating about 
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COHERENT™ IS SUPERIOR TO UNIX* 
AND IT’S AVAILABLE TODAY 
ON THE IBM PC. 


Mark Williams Company hasn’t just taken a mini-computer 
operating system, like UNIX, and ported it to the PC. We 
wrote COHERENT ourselves. We were able to bring UNIX 
capability to the PC with the PC in mind, making it the most 
efficient personal computer work station available at an 
unbelievable price. 

For the first time you get a multi-user, multitasking operating 
system on your IBM PC. Because COHERENT is UNIX- 
compatible, UNIX software will run on the PC under 
COHERENT. 

The software system includes a C-compiler and over 100 utili¬ 
ties, all for $500. Similar environments cost thousands more. 

COHERENT on the IBM PC requires a hard disk and 256K 
memory. It’s available on the IBM XT, and Tecmar, Davong 
and Corvus hard disks. 

Available now. For additional information, call or write, 

Mark Williams Company 

1430 West Wrightwood, Chicago, Illinois 60614 

312/472-6659 



Mark 

Williams 

Company 


COHERENT is a trademark of Mark Williams Company. 
•UNIX is as trademark of Bell Laboratories. 
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VMS. WHEN THE ONLY THING YOUR 
USERS WANT IS EVERYTHING, NOW. 


If "I want it now” seems like 
the only thing you ever hear from 
your users - and if your applica¬ 
tions backlog is telling you that 
“now” is going to be a long way 
off - you need to know about our 
VMS™ Virtual Memory System 
software. 

Our VMS software, which 
was designed exclusively with 
VAX™ computer systems, is far 
more than a mere operating sys¬ 
tem. It is a complete operating 
environment. One that can en¬ 
compass all the ways you use 
computers. To get it done, now. 

The VMS operating environ¬ 
ment provides you with the in¬ 
dustry’s most complete set of 
utilities and software products 
for program development and 
system management. Interac¬ 
tive, realtime, even background 
batch applications can be devel¬ 
oped with ease and speed. 

Everything you need for 
corporate-level applications is 
included. Advanced office 
automation software. Proven 
transaction processing capabili¬ 
ties. Comprehensive CODASYL- 
compliant and relational data¬ 
base managers. User-friendly 
query tools and an integrated 
data dictionary. Plus the indus¬ 
try’s most complete set of local 
and wide area networking facili¬ 
ties, allowing you to exchange 
data files and cooperate with 
systems made by IBM - via 
SNA gateway - as well as many 
other vendors. 

I (MS PUTS YOUR 
PROGRAMMERS IN 
THE FAST LANE. 

You’ve heard a lot of prom¬ 
ises about increasing program¬ 
mer productivity over the years. 
After all, the application backlog 
in most companies has reached 
18 months to 2 years. And the 
demand for new applications is 
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increasing by some 50 percent 
a year. So there’s hardly a more 
important subject. 

Digital’s VMS software does 
more to increase programmer 
productivity than any other sys¬ 
tem can. Because it combines 
everything you need for applica¬ 
tion design, development and 
maintenance into a single, inte¬ 
grated operating environment. 

You’ll have the industry’s best 
symbolic debugger to work with. 
A command language so flexi¬ 
ble it lets you stop application 
execution and enter new com¬ 
mands at any point, and even 
develop your own commands 
through its macro facility. A com¬ 
mon runtime environment that 
lets you reuse code instead of 
rewriting it. And more advanced 
languages than any other sys¬ 
tem - some 16 in all, including 
Ada®, C, COBOL, FORTRAN, 
and Lisp - all of which you can 
combine in a single program 
through the common calling 
standard. 

This gives you a distinct ad¬ 
vantage in team development 
projects. Each developer can 


use the language best suited to 
his or her talents and to the task. 
Large, complex projects such 
as transaction processing appli¬ 
cations can be designed, devel¬ 
oped, maintained and managed 
easily. 

The end result - better code 
in less time. And satisfied, pro¬ 
ductive users at every level, from 
the factory floor to the executive 
suite. 

The point is, many other sys¬ 
tems promise you ease of use, 
up to a certain point. The VMS 
environment simplifies devel¬ 
opment not only for small, ad 
hoc projects - but also when 
your applications reach highly 
sophisticated, complex levels. 

WITH VMS, FLEXIBILITY 
IS BUILT IN. _ 

You get another big advan¬ 
tage when you develop your 
applications with the VMS oper¬ 
ating environment: the range 
of hardware you can run them 
on. Namely, our VAX computer 
family, the industry standard for 
32-bit computing. 

The VMS environment spans 


the entire VAX systems family, 
from the smallest MicroVAX™ 
system to the largest VAX 8600™ 
and multiprocessor VAXcluster™ 
systems. This is the biggest soft¬ 
ware-compatible growth path 
in the world. And with the data 
security mechanisms built into 
VMS software, increased access 
doesn't mean compromised 
security. 

If you need UNIX® software 
capabilities, the VMS operating 
environment can readily provide 
them through the VNX™ option. 
And if you need the convenience 
of applications packages, you’ll 
have over 2,000 to choose from 
- created for VMS software by 
Digital and independent ven¬ 
dors. Your choices are never lim¬ 
ited when you start with the VMS 
environment. 

MSf ENGINEERED 
MEANS ENGINEERED 
TO A PLAN. _ 

Digital’s VMS operating envi¬ 
ronment, like all Digital hardware 
and software products, is engi¬ 
neered to conform to an overall 
computing strategy. This means 
that our systems are engineered 
to work together easily and ex¬ 
pand economically. Only Digital 
provides you with a single, inte¬ 
grated computing strategy di¬ 
rect from desktop to data center. 

For more information about 
how our VMS operating envi¬ 
ronment can help you cut your 
applications backlog, contact 
your local Digital sales represen¬ 
tative. Or call 1-800-DIGITAL, 
ext. 219. 


THE BEST ENGINEERED 
COMPUTERS 
IN THE WORLD. 
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The 98x supermini from Pyramid 
Technology Corporation. 


$35 million in 1985, predicated 
on the success of the 98x. Pyra¬ 
mid’s first system was the 90x, a 
32-bit supermini that Krahling 
claimed was the first RISC ma¬ 
chine. The 90x has 2.5 times the 
power of a VAX 780; a multi¬ 
processor system, the 90Mx, re¬ 
portedly provides 4.2 times VAX 
780 power (and doesn’t need to be 
stored in silos). But the 90Mx has 
a master-slave multiprocessor 
environment. 

Krahling says there are 5000 
VAXen on order, with about 20 
percent of that total (1000) run¬ 
ning UNIX. Pyramid thinks of 
itself as the number two supplier 
in superminicomputers, and will 
certainly try harder to gain as 
much of that 1000-system DEC 
market as it can. It plans to sell 80 
percent through its own sales 
force, according to Krahling. 

AT&T CONTINUES 
SYSTEM V PUSH 

AT&T continued its aggressive 
efforts to promote System V as the 
“UNIX system for business’’ by 
showing off its Remote File Sys¬ 
tem (RFS) at the Usenix Confer¬ 
ence in Portland. Demonstrated 
on 3B2 hardware, the system was 
described as “Streams-based”, 


and capable of providing “proto¬ 
col and media independence’’ 
within System V. 

The Remote File System pro¬ 
vides transparent access to direc¬ 
tories, files, special devices, and 
named pipes. An administrator 
may select directories in the local 
file tree to advertise to remote 
machines (by using the adv com¬ 
mand). Other users can then use 
the mount command to mount 
advertised directories at any 
place in their local file tree (see 
Figure 1). Its name server fea¬ 
tures machine-independent re¬ 
source naming using a hierarchi¬ 
cal domain -naming convention. 
Its recovery mechanism ensures 
that no single machine failure 
will bring down the network. 

Administrative support will in¬ 
clude: selective resource sharing 


The Remote File System 
provides transparent 
access to directories, 
files, special devices, 
and named pipes. 


that allows administrators to ad¬ 
vertise any directory for read/ 
write or read-only access, and to 
provide lists of machines autho¬ 
rized to mount specific directo¬ 
ries; machine authentication, 
allowing administrators to deter¬ 
mine passwords to aid in identifi¬ 
cation of systems requesting re¬ 
mote mounts; user and group ID 
mapping (providing such options 
as the ability to map all remote 
IDs to a single remote ID, thus 
maintaining identical IDs across 
a group of machines—with se¬ 
lected exclusions—and the use of 
explicit mapping tables); and a 


monitor that allows report gen¬ 
eration to separate remote from 
local activity. 

The rumor is that AT&T will 
release the Remote File System 
on or before the Anaheim Uni- 
Forum show in February 1986. 

A company already in the dis¬ 
tributed file system arena, Sun 
Microsystems, is undaunted by 
the recent AT&T developments. 
Bill Keating, marketing manager 
for Sun’s Network File System 
(NFS), first pointed out that “this 
is not the first time (AT&T has) 
talked about (their remote file 
system). Who knows when it will 
really be available?” Keating 
pointed out that Sun’s NFS can be 
implemented with non-UNIX sys¬ 
tems as well. “Our intent is to 
provide a system that allows 
interconnection among heteroge¬ 
neous computers and operating 
systems. We’re after DEC, and 
we’re after the IBM PC, just to 
name a couple.” He referred to 
NFS as “a superset” of distribut¬ 
ed file systems because of its 
ability to be implemented across 
the computer spectrum. 

But design flexibility is not as 
important as third-party support 
these days, particularly when the 
competition comes from a com¬ 
pany the size of AT&T. Keating 
conceded that, saying that Sun 
“will become compatible with 
whatever AT&T offers,” but he 
also said Sun “is trying to make 
NFS a standard (in its own right) 
by getting other companies to 
adopt it.” He said Celerity Com¬ 
puting has been added to the list 
of NFS supporters—a list that 
already included Pyramid, Gould, 
and Mt. XINU. The latter an¬ 
nounced a few months ago that it 
would implement NFS on VAXen 
running 4.2BSD. 

For its part, AT&T claims not to 
be in direct competition with Sun. 
Spokesperson Lawrence Brown 
said, “(NFS and the Remote File 
System) are really targeted at 
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Figure 1 — Diagram representing the transparent access capability of 
AT&T's Remote File System. 


different markets.” Brown thinks 
Sun is more interested in “the 
engineering workstation” than 
the multiuser business office sys¬ 
tem stressed by AT&T—a natural 
assumption given Sun’s commit¬ 
ment to 4.2BSD. Brown praised 
NFS as a “well-designed system”. 
But he pointed out that the Re¬ 
mote File System will not “sacri¬ 
fice UNIX functionality,” as is the 
case with NFS. 

THE REAL WORLD LOOMS 

There was a scene in the movie 
Woodstock where well-known 
Hog Farmer and Berkeley-San 
Francisco cultural icon Wavy 
Gravy implored those who “don’t 
think capitalism is too weird” to 
buy some hot dogs from a poor 
soul whose tube steak stand had 
burned to the ground. Certainly, a 
similar feeling was in the air as 
hackers mingled in Portland at 
what is becoming an increasingly 
commercial Usenix Conference. 

The dreaded “C-words”, cap¬ 
italism and commercialism, are 
inexorably wending their way 


The dreaded "C- 
words", capitalism and 
commercialism, are 
inexorably wending 
their way into the 
UNIX community. 


into the UNIX community. One 
needed look no further than the 
AT&T army swarming about its 
dominant booth in the center of 
the exhibition. Even some of the 
technical sessions were more like 
“the gospel according to (enter 
company name)” than objective 
discourses. To be sure, some of 
the sessions were still of the old 
school. There was substance and 
humor enough for those who were 
persistent enough to ferret it out. 

But there’s big bucks to be had 


in UNIX systems, and like Wavy 
Gravy, there’s more than a couple 
of folks who think that latching 
onto a few of those bucks might 
not be so weird after all. 

CRAY-2 RUNS SYSTEM V 

Having $17.6 million doesn’t 
make you as rich as it used to, but 
it’s enough to make you the proud 
owner of a Cray-2 supercom¬ 
puter, the new UNIX-based mon¬ 
ster from Cray Research, Inc. The 
Cray-2 is designed to deliver as 
much as 12 times the perfor¬ 
mance of the Cray 1 and is the 
first (and only) system of this size 
to run System V. Previously, Cray 
ran its own operating system, 
COS, on its systems. (The com¬ 
pany continues to support COS, 
but it’s also reportedly planning 
to offer UNIX on its XMP com¬ 
puter sometime next year.) 

The Cray-2 can consume just 
about anything for breakfast. Its 
clock cycle is 4.1 nanoseconds, 
and main memory takes in 256 
million 64-bit words, or about 
1.6 gigabytes. Main memory me¬ 
gabytes are divided into four 
quadrants with 128 interleaved 
banks. The CPU has one fore¬ 
ground and four background pro¬ 
cessors. The four background 
processors, each said to be more 
powerful than the Cray-1 CPU, 
perform scalar and vector calcu¬ 
lations, and can operate either 
independently or jointly. The 
CPU has 320 plug-in modules, 
each holding 750 integrated cir¬ 
cuit packages. There are about 
240,000 chips in each module. 

To keep all this horsepower 
cool, a liquid-immersion, fluoro¬ 
carbon-based cooling system 
is used. The inert fluid is circu¬ 
lated throughout the CPU cabi¬ 
net, where it comes into direct 
contact with the integrated cir¬ 
cuit packages. 


Roger Strukhoff is the Associate 
Editor of UNIX REVIEW. ■ 
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THE HUMAN 
FACTOR 


Prototyping a memorandum database 

by Richard Morin 


Paper is miserable stuff. It piles 
up everywhere, is impossible to 
organize well, and makes one 
irritable by its mere presence. 
The real problem is not with 
standard-sized pieces of paper, 
though. Filing cabinets, binders, 
and other containers hold these 
reasonably well. It’s the little 
scraps of paper that are the real 
enemy—particularly those that 
can’t be handled, filed, or thrown 
away. 

This column previously has 
offered an overview of UNIX tools 
suitable for text handling (Au¬ 
gust, 1984). It also has touted the 
use of prototyping techniques un¬ 
der UNIX (May, 1985). Finally, it 
has speculated on the uses of 
daemons and hashed tables (July, 
1985). The current column fol¬ 
lows in the same vein, describing 
the prototyping of a UNIX tool for 
handling memoranda, and specu¬ 
lating about future work. 

MEMO 

The first version (see Figures 
la and lb) of memo is more a 
proof of concept than a usable 
tool. Still, it is a working program, 
able to copy text from standard 
input (normally taken from the 
keyboard and terminated with a 
CTRL-D) to a file named . memo in 
the user’s home directory. It even 
adds a time stamp and some 
formatting text to the message. 
Since the command can be in¬ 



voked from anywhere in the file 
system, it can be used to make 
small notes without changing 
directories. The collected memos 
can then be perused, annotated, 
or even (gasp) acted upon at the 
user’s leisure. 

The principal limitation is that 
everything ends up in the same 
file, $HOME/.memo. It would be 
much handier to have memo file 
text under specified topics, as: 

% memo thf 

This would add text to a file by 
the name of $HOME/memos/thf. 
Memos of a general nature could 
default to the topic misc , stored in 
SHOME/memos/misc. A second 
version of memo, shown in Figure 
2, implements the new option. 

If the user puts a link in the 
memos directory, data will be 
stored in the file specified by the 


link. This means that the user 
can use links in the memos 
directory to funnel text to files in 
arbitrary locations. This is an 
interesting, useful, and complete¬ 
ly serendipitous side effect. Part 
of UNIX’s fertility lies in the 
number of such “freebies” it 
gives us. 

The tool itself still lacks a bit of 
polish and bulletproofing, howev¬ 
er. It ignores extra arguments, 
and crashes if a user tries to 
invoke it without first creating 
the memos directory. Although a 
README file might alert the user 
to these problems, a better an¬ 
swer is to make the program more 
robust and self-sufficient. 

A third version, shown in Fig¬ 
ure 3, automatically creates a 
memos directory for the user. 
Being defensively programmed, it 
first checks for the existence of a 
file named memos. It also per¬ 
forms a simple argument check, 
requiring either zero or one 
arguments. 

Now that we have a version 
that can be safely handed to a 
naive user, we could easily quit. 
The code is small, relatively sim¬ 
ple, and does its job well. Unfortu¬ 
nately, creeping featurism once 
again rears its ugly head. Why not 
use the UNIX hierarchical file 
system to handle sub-topics? 

A fourth version, which is 
shown in Figures 4a and 4b, does 
this, building a tree of directories 
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: memo.vl - file memo 


m=$H0ME/.memo 

u=- 

date 

>> $m 

* date stamp 

echo $u 

>) $m 

# underline 

echo 

>> $m 

* blank line 

cat 

>> $m 

* copy stdin 

echo 

>) $m 

n blank line 


Figure la — A rudimentary 
memo shell script. 


Thu Jan 

3 13:10:31 PST 

1985 

This is 

a test... 


Thu Jan 

3 13:12:21 PST 

1985 

This is 

another test.. 


With a second line... 



Figure lb — Output from the 
script in Figure la. 


: memo.v2 - file memo under topic 
topic=$HOME/memos/${l-misc} 


date 

>> 

$topic 

echo $u 

>> 

$topic 

echo 

>> 

$topic 

cat 

>> 

$topic 

echo 

>> 

$topic 


Figure 2 — An enhanced version 
of memo capable of filing entries 
by topic. 
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to handle topics and subtopics 
specified by the user. It also 
initializes and generalizes topics 
automatically, detecting and han¬ 
dling conflicts in naming. As¬ 
sume that the user has typed: 

% memo thf/cols 

and has entered the appropriate 
memorandum. The program will 
append the output text to 
SHOME/memos/thf/cols. 

The main script (Figure 4a) will 
perform the functions of previous 
iterations of memo, adding only 
some syntax checking, a call to a 
subsidiary script (see Figure 4b), 
and a test for whether the topic 
has been generalized. The subsid¬ 
iary script will create any needed 
directories, and rename conflict- 


It's the little scraps of 
paper that are the real 
enemy—particularly 
those that can't be 
handled, filed, or 
thrown away. 


ing topic files, creating files such 
as $HOME/memos/thf/mtsc. It 
will also tell the main script 
which directory to use for the 
text. 


As you can see, the code has 
grown a bit, from 10 to almost 
100 lines. Still, it does quite a bit, 
and it is fairly bulletproof and 
convenient. So much for the 
problem of storage. Now, what 
about retrieval? 

RETRIEVAL 

The Summer 1985 proceed¬ 
ings of the Usenix Conference 
and Exposition contains, along 
with other interesting items, a 
paper entitled: “UNIX tools for a 
Personal Database” by Michael J. 
Hawley of Lucasfilm, Ltd. The 
paper describes a set of tools for 
finding files according to the 
keywords contained therein. The 
tools use a hashed inverted index, 
linking each keyword to the most 
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nication management system. Use as a 
stand-alone system or integrate with 
your word processing system. Features: 
automatic dialing, answer verification, 
message exchange, and recording of 
all status. Concurrent message trans¬ 
mission and reception. Conversational. 


APPLICATION 

DEVELOPMENT 

SSL:™ A powerful terminal-inde¬ 
pendent screen manager and appli¬ 
cation development system with many 
automatic functions such as menu 
formatting, screen and table handling, 
data base calls, and more. 

EDITORS 

SSE:™ A full screen editor for UNIX 
developers and non-technical end 
users. Features type-it-as-you-wish- 
to-see-it text entry, typewriter-like 
margin settings, and a full set of text 
editing functions. Easy to learn. 

These products are available now for 
most UNIX or UNIX-derivative oper¬ 
ating systems, including System V. 

4.2 BSD. 4.1 BSD. Xenix, Version 7. 
System III. Uniplus, and others. 


’UNIX is a trademark of AT&T Bell Laboratories. 
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: memo.v3 - file memo under topic 

if test $# -gt 1; then # too many args? 

echo Usage: memo [topic] ; exit 1 
fi 

if test -f $H0ME/memos; then n file conflict? 
echo "A file named $HOME/memos already" 
echo "exists, memo will not create" 
echo "the $H0ME/memos directory until" 
echo "the file is removed or renamed." 
exit 2 
fi 

if test ! -d $H0ME/memos; then # missing dir? 

mkdir $H0ME/memos 
fi 

topic=$HOME/memos/${1-misc} 


date 

)> 

$topic 

echo $u 

>> 

$topic 

echo 

>> 

$topic 

cat 

>> 

$topic 

echo 

>> 

$topic 


Figure 3 — A version of memo for 
automatically creating a “memos” 
directory for output . 

recent 500 (or so) references 
found. 

The principal tool is a printing 
utility, p, which searches the 
index for logical combinations of 
keywords. It has options that 
allow the printing of file data, 
names, and so forth. Another 
tool, grok, expands the scope of 
the search by using multiple 
indices, adding files to the cur¬ 
rent index, and performing sever¬ 
al other functions. Finally, a 
number of low level tools assist in 
managing the system and build¬ 
ing new applications. 

Automated keyword indexing 
is a nifty facility that handles 
many searching problems clean¬ 
ly. The only problem, a lack of an 
easy means for users to impose 
structure on their databases, can 


: memo.v4 - file memo under {sub-[topic 

* Usage: memo [topic[/subtopic... ]] 

n 

n The entered text is stored in a file named 
n topic(/...) in $H0ME/memos/. The default 
n (sub-)topic is misc. Topic generalization 

* is handled automatically. 
n 

» By appropriate use of symbolic and/or hard 

* links, actual file locations may be elsewhere. 

* 

usage= Usage: memo [topic[/subtopic...]] 
if test $# -gt 1; then * too many arguments? 

echo $usage ; exit 1 
fi 

dir=$HOME/memos 

if test -f $dir ; then * conflicting file? 
echo "$dir already exists as a file, so memo" 
echo "cannot create it as a directory."; exit 2 
fi 

if test ! -d $dir : then # missing directory? 

mkdir $dir 
fi 

topic=$|l-misc} n handle topic details 

if test -n " echo $topic i sed 

^ * tack on a trailing & 

/]*/@@)g : « kill off .../s 

s@Y/]f/]*@@~" * km off ...&s 
then echo $usage : exit 3; fi # syntax error 

cd memo.v4.s $topic~ 

topic= echo $topic i sed 

S@\*/@§T n kill off .../$ 

if test -d $topiC; then * topic is general? 

topic=$topic/misc 

fi 

date )> $topic 

echo $u )> $topic 

echo )> $topic 

cat >> $topic 

echo >> $topic 

Figure 4a — A version of memos that shows what comes of creeping 
featurism. 


be handled by memo. Truly ad¬ 
venturous types may wish to 
include all sorts of text files in the 


indexing game. If the indexing 
takes too much time, let a daemon 
do it at low priority. 
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U THE HUMAN FACTOR 


: memo.v4.s - handle general topics 

next= echo $1 i 


sed 

* kill off \..$ 

s@/[ /]*$@@" 

« kill off /...$ 

base= 


slash= 


md=$HOME/memos 


cd $md 


while test -n "$next"; 

do 

this= echo $next i 

sed 


* kill off /...$ 

newbase=$base$slash$this 

next= echo $next i 

sed " 

S 

* kill off "... 

s @/m" 

* kill off 1st / 

if test -f $this 

» conflicting file? 

then 


mv $this memotmp 


mkdir $this 


mv memotmp $this/misc 

fi 


if test I -d $this 

» no directory? 

then 


mkdir $this 


fi 

i 

cd $this 


base=$newbase 


slash= / 


done 


echo $md$slash$base # for memo.v4’s cd 


Figure 4b — A subsidiary script for memo capable of filing memoranda 
according to topic and subtopic. 


CEEGEN-GKS 
GRAPHICS 
SOFTWARE in C 
for UNIX 

□ Full implementation of 
Level 2B GKS. 

n Outputs, Inputs, Segments, 
Metafile. 

□ Full Simulation for Linetypes, 
Linewidths, Fill Areas, 
Hatching. 

□ Circles and Arcs, Ellipses 
and Elliptic Arcs, Bezier 
Curves. 

□ Ports Available on all 
Versions of UNIX. 

□ CEEGEN-GKS is Ported to 
Gould, Masscomp, Plexus, 
Honeywell, Cadmus, 
Heurikon, Codata, NBI, 

NEC APCIII, IBM-AT, Silicon 
Graphics, Pyramid, Tadpole 
Technology, Apollo, AT&T 
3B2, AT&T 6300, DEC VAX 
11/750,11/780 (4.2, 5.2), 
NCR Tower. 

□ CEEGEN-GMS GRAPHIC 
MODELING SYSTEM, An 
Interactive Object- 
Oriented Modeling Product 
for Developers of GKS 
Applications. CEEGEN-GMS 
and GKS Provide the 
Richest Development 
Environment Available on 
UNIX Systems. 

□ Extensive List of Peripheral 
Device Drivers Including 
Tektronix 4010, 4014, 4105, 
4109, HPGL Plotters, 
Houston Instruments, 
Digitizers, Dot Matrix 
Printers and Graphics CRT 
Controllers. 

□ END USER, OEM, 
DISTRIBUTOR DISCOUNTS 
AVAILABLE. 

CEEGEN CORPORATION 

20 S. Santa Cruz Avenue. Suite 102 
Los Gatos, CA 95030 
(408) 354-8841 
TLX 287561 mlbx ur 

EAST COAST: 

John Redding & Associates 
(617) 263-8206 
UNITED KINGDOM: 

Tadpole Technology PLC 
044 (0223) 861112 
UNIX is a trademark of Bell Labs. 
CEEGEN-GKS is a trademark of 
Ceegen Corp. 
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Prototyping has a way of be¬ 
coming addictive. Each new fa¬ 
cility suggests others, and the 
prototyper may never be able to 
say that an application is really 
“done”. Still, any working snap¬ 
shot of a prototyped system can 
be used or even shipped, and the 
users need not know all of the 
prototyper’s dreams. 

Mailfor Mr. Morin can be sent 


to the Canta Forda Computer 
Lab, PO Box 1488, Pacifica, CA 
94044. 


Richard Morin is an independent 
computer consultant specializing in 
the design, development, and docu¬ 
mentation of software for engineer¬ 
ing, scientific, and operating sys¬ 
tems applications. He operates the 
Canta Forda Computer Lab in Paci¬ 
fica, CA. ■ 









YOU CHOOSE: 


Terminal Emulation Mode 

MLINK 

CU/UUCP 

Menu-driven Interface 

Yes 


Expert/brief Command Mode 

Yes 

Yes 

Extensive Help Facility 

Yes 


Directory-based Autodialing 

Yes 


Automatic Logon 

Yes 

Yes 

Programmable Function Keys 

Yes 


Multiple Modem Support 

Yes 

Yes 

File Transfer Mode 

Error Checking Protocol 

Yes 

Yes 

Wildcard File Transfers 

Yes 

Yes 

File Transfer Lists 

Yes 

Yes 

XMODEM Protocol Support 

Yes 


Compatible with Non-Unix Systems 

Yes 


Command Language 

Conditional Instructions 

Yes 


User Variables 

Yes 


Labels 

Yes 


Fast Interpreted Object Code 

Yes 


Program Run 

Yes 


Subroutines 

Yes 


Arithmetic and String Instructions 

Yes 


Debugger 

Yes 


Miscellaneous 

Electronic Mail 

Yes 

Yes 

Unattended Scheduling 

Yes 

Yes 

Expandable Interface 

Yes 


CP/M, MS/DOS Versions Available 

Yes 



MLINK 


The choice is easy. Our MLINK Data Communications System is the most powerful and 
flexible telecommunications software you can buy for your Unix™system. And it’s easy 
to use. MLINK comes complete with all of the features listed above, a clear and com¬ 
prehensive 275-page manual, and 21 applications scripts which show you how our 
unique script language satisfies the most demanding requirements. 


Unix System V 

BSD 4.2 

MS-DOS 

Unix System III 

Xenix 

CP/M 

Unix Version 7 

VM/CMS 

and more. 

Choose the best. Choose MLINK. 


Altos 

Data General 

IBM 

Arrete 

DEC 

Onyx 

AT&T 

Kaypro 

Plexus 

Compaq 

Honeywell 

and more. 


MLINK is ideal for VARs and application builders. Please call or write for information. 



Corporate Microsystems, Inc. 


P.O. Box 277, Etna. NH 03750 


(603) 448-5193 


MLINK is a lr.nlrm.irk of Corporate Mirrosyslems. Im. Unix is a tiadrmark ol ATM Bell lahoiatorie 
trademarks of Microsoft Corp. C P/M is a registered trademark of Digital Research. 


. IBM is a legistered trademark of IBM Corp. MS-DOS and 
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FROM NOWON, CONSIDER IT SUPPORTED 


When it comes to Unix® systems, 

“standard” isn’t always good enough. 

Experts agree that the most powerful and most tech¬ 
nically advanced Unix system is the Berkeley version. 
That’s why 4.2BSD from Berkeley is the operating system 
of choice for software development, networking, engi¬ 
neering, CAD/CAM and demanding scientific applica¬ 
tions. Other Unix systems don’t have the features 
advanced users require. 

But 4BSD was developed at a university, so it has never 
had real-world support. User assistance, bug fixes, 
updates and enhancements have not been provided. 

Now that’s changed. 

MT XlNU, the4BSD specialist, supplies: 

■ Fully supported 4.2BSD-based binary licenses 
(MORE/bsd) for VAX® computers. 

■ 4.2BSD source support and source updates for current 
4.2BSD source licensees. 


“We know UNIX® Backwards and Forwards” 



UNIX” SUPPORT FROM BERKELEY 


. . . .. . . . Circle No. 7 on Inquiry Card 

739 Allston Way, Berkeley, CA 94710 ■ 415/644-0146 ■ ucbvax!mtxinu!mtxinu 

MORE/bsd and MT XlNU are trademarks of Mt Xinu Inc., DEC and VAX are trademarks of Digital Equipment Corp., UNIX is a trademark of Bell Laboratories. 


■ Enhanced 4.2BSD-based source software for new 
sites, with or without redistribution rights. 

■ Full support for a wide variety of DEC® and non-DEC 
peripherals. 

■ Assistance for OEM’s and hardware manufacturers 
developing 4.2BSD-based products. 


MT XlNU personnel have been involved with 4BSD 
development from the beginning. Now we are producing 
4BSD performance enhancements, advanced network¬ 
ing, other Unix system extensions, and support for new 
peripherals and architectures. As a service, we distribute 
4BSD bug reports and proposed bug fixes to the com¬ 
munity. Our years of experience can speed and improve 
your4BSD implementations. 


4.2BSD. It’s always been better than just 
“standard.” Now, with MT XlNU, consider 
it supported. 







verdix 

DEVELOPMENT SYSTEM 


Why the DoD mandated 

Ada. When the Department of 
Defense mandated Ada for 
embedded and mission-critical 
systems development, there was 
good reason. This reusable, high- 
order language can put an end 
to the Software Crisis. Ada de¬ 
creases skyrocketing software 
costs, improves management and 
control, reduces life cycle costs, 
boosts productivity, dramatically 
reduces errors and cuts training 
costs. Ada is the language of the 
day, and Verdix speaks it. Louder 
and clearer than anyone. 

Why others are mandating 
Verdix. We have the first highly 
portable production-quality Ada 
development system. The Verdix 
Ada Development System 
(VADS™) is the first production- 
quality Ada compiler system to 
meet DoD’s stringent require¬ 
ments for Ada language mission- 
critical systems development. It is 
now available on the DEC VAX™ 
series computer systems and 
includes: 


• High performance, rehost- 
able/retargetable Ada com¬ 
piler under UNIX™ 4.2 
BSD and ULTRIX™ (soon 
under VMS) with excellent 
diagnostics; 

• Symbolic Debugger; 

• Library Management 
Utilities; 

• Run Time System. 

Designed for large scale and 
embedded systems development, 
VADS speeds programming 
faster than any other Ada 
compiler. 



VADS also helps programmers 
quickly learn the Ada language. 
The unexcelled diagnostics speed 
correction time and shorten 
development time. The Symbolic 
Debugger lets you watch your 
program execute, Ada line by 
Ada line or machine instruction 
by machine instruction, even for 
remote embedded systems. And 
it’s highly portable: VADS will 
be available on a wide range of 
computer systems. 

Fully DoD validated; avail¬ 
able now. VADS is validated and 
ready to be put to work. It’s not a 
promise. It’s available now (as in 
“here and now”) and already in 
use by major DoD contractors. 

To find out for yourself how 
the Verdix Ada Development 
System can work for you, write, or 
call (703) 448-1980 and talk to 
Howard Nevin, Vice President 
of Product Planning and Cor¬ 
porate Development for more 
information. 


IT'S VALIDATED. USE PROVEN. AVAILABLE NOWI 


VERDIX’ 

Ada Development System ^ 


Verdix Corporation 

14130 Sullyfield Circle 
Chantilly, Virginia 22021 
Tel: (703) 378-7600 


© 1985, Verdix Corporation. Ada is a registered trademark of the US Government (Ada Joint Program Office). UNIX is a trademark of Bell Laboratories. 
Verdix and VADS are trademarks of Verdix Corporation. ULTRIX is a trademark of Digital Equipment Corporation. 

Sun Workstation is a registered trademark of Sun Microsystems, Inc. 
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ATTERNS 
IN DATA 
MANAGEMENT 


The intricacies of database systems 

by Eric Allman 


ne of the earliest applications of 
computer equipment was for storing, organizing, 
and retrieving information. By its very nature, the 
computer is adept at accurately performing tedious 
jobs. A task such as a search through documents for 
a particular piece of information is the very sort of 
thing that humans do poorly. Unlike the computer, 
the human mind drifts; digits can be transposed and 
sheets of paper tend to get missorted. 

Early on, IBM proved particularly astute in 
exploiting the power of the computer for database 
activities. Up to that time, database management 
was often machine-assisted (using equipment such 
as card sorters and collators), but processors 
themselves tended to be limited to scientific tasks 
requiring substantial mathematical computation. 
Then came the IBM 1401, which differed radically 
from previous computers: I/O was emphasized, even 
more so than computational ability; most instruc¬ 
tions operated on sequences of bytes rather than on 
words; and operations were included in the hard¬ 
ware to perform such functions as comma insertion 
and leading blank suppression. 

Database technology has grown in three direc¬ 
tions since those days. First, the underlying hard¬ 
ware has become faster, smaller, and cheaper. In 
some cases this has been used to provide dramatic 
“brute force” systems such as ‘‘processor per 
track” technologies that scan all the data read in 
from disk. Second, the techniques for organizing 
data have become more sophisticated. In particular, 
the assumption of a linear media such as cards or 
tape is now the exception rather than the rule. 
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DATABASE OVERVIEW 


New generations of tools are 
eliminating the need to consult a 
programmer in order to access data. 

Third, new generations of tools are eliminating the 
need to consult a programmer in order to access 
data. 

Today, the database market tends to be divided 
into three camps. The “big guns” camp usually 
requires huge machines, facilities for storing data¬ 
bases measured in gigabytes, and large staffs of 
programmers. This camp includes credit card 
companies, the IRS, and airline reservation sys¬ 
tems. Their databases are typically very traditional, 
leaning toward numbers and fixed-size character 
fields. They can often be accessed by tens of 
thousands of people at once. Unsurprisingly, invest¬ 
ments in such systems can easily run into tens of 
millions of dollars. 

At the opposite extreme are the many small 
database systems currently popular on micros and 
minis. These databases are small, measured in 
kilobytes, and usually can fit on a single diskette. 
Programming is often done using a “fill in the 
blanks” style of interface that can be used with 
minimal training. In many cases, non-traditional 
data types such as text and graphics can be 
accommodated. However, small systems are usually 
single-user, slow, and limited to small databases. 
The investment seldom exceeds a few thousand 
dollars. 

Between these two extremes, a new class of 
supermini-scale systems has arisen. They have 
many of the features of large systems, such as 
multiuser access, good performance, and ample 
programming language access. Many also include 
some of the nicer features of the small systems, 
such as applications generators and non-traditional 
data types. Such databases are typically in the 1 to 
100 MB range and cost $10,000 to $100,000. 

DATA MODELS 

Database management systems typically have a 
preferred way of organizing data. This is called the 
data model. For example, a file cabinet lends itself 
naturally to a particular way of organizing data. 
There are many data models, but three are both 
popular enough and different enough to be interest¬ 
ing: the hierarchical model, the network model, and 


the relational model. 

The hierarchical model most closely approxi¬ 
mates a tree. For example, the UNIX file system is a 
flexible form of hierarchy: every node in the file 
system has a unique parent and may have some 
number of children. (It departs from a strict 
hierarchical model, though, when links are consid¬ 
ered since files can be created that exist in more 
than one directory.) 

Hierarchies have obvious physical correspon¬ 
dences, making the transition from manual to 
automatic systems convenient (for example, a file 
cabinet is a hierarchy—a cabinet contains drawers, 
a drawer contains files, a file contains pages, and so 
on: moreover, a file can be in at most one drawer, a 
page in at most one file, and so forth). In the early 
days of computing, hierarchies were convenient 
since they could be represented easily on linear 
mediums such as punched cards and magnetic 
tapes. 

However, hierarchies are rife with problems. 
Finding information that has been classified under 
a different heading than you have can require an ex¬ 
pensive sequential scan of the entire database. 
Since a piece of information can only exist in one 
place, data must be duplicated if it logically appears 
in more than one location. This complicates 
updates: when a datum is updated, all copies must 
be found and updated. 

The network model represents an attempt to fix 
the data duplication problem inherent in the 
hierarchical model by allowing arbitrary pointers. 
This has the extremely desirable property of 
allowing data to appear under more than one 
heading. For example, the description of the parts 
making up an assembly could reasonably appear in 
three places: in the file for the assembly, once in 
each of the files for the parts themselves, and once 
in each of the files for the manufacturers supplying 
the parts. 

The network model—like the hierarchical mod¬ 
el—has update problems. When a datum is deleted, 
all pointers to it must be found and deleted. UNIX 
solves this problem by deleting references rather 
than the data itself: only when the last reference is 
deleted is the data actually deleted. In database 
terms, this is called an “update anomaly”, since the 
delete action functions differently depending on the 
state of the database. Update anomalies can 
occasionally be useful, as in this example, but they 
normally are harmful and should be avoided. 

Functioning network database systems such as 
CODASYL include reference counts and back 
pointers so that deletion of a datum can also delete 
all references to that datum. The cost of this is 
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Order the UNIFY 
Demo Kit, and learn 
why UNIFY is the 
DBMS chosen by more applica¬ 
tions developers and UNIX-based 
computer vendors. 

You can interactively demo 
UNIFY’s speed advantages, made 
possible by its automatic selection 
of four access methods. 

You can see the simplicity of 
UNIFY’s menu-based design and 

MTU** 


details UNIFY’s 
easy forms design, 
multiple security fea¬ 
tures and unmatched 
host language inter¬ 
face—and proves 
why this is the one DBMS that can 
keep pace with your needs. 

Test it yourself. The Demo Kit 
includes disk or tape, demo hand¬ 
book, plus comprehensive manuals 
that show how to build virtually 
any application—all for $150. 

MF " 


ace, Lake 

503/635-6265 


































































































Call Integrated Solutions. 


high-performance, VME based 
workstation that's easy to 
to the MC68020. 


Upgrades to the new Motorola 
68020 are as easy as a single 
board swap in the Optimum V 
Series. And when you're 
ready to add on even more 
processing power, you have 
a wide range of sophisticated 
Integrated Solutions' products 
to choose from. Peripherals. 
Packaging alternatives. 
Performance features. 
Applications. All at low 
incremental cost. 


high-speed memory and 
UNIX 4.2 BSD, the Optimum V 
is designed to provide you, 
the technical professional, 
with state-of-the-art 
technical processing. 

And Integrated Solutions 
makes processing even easier 
with a unique user interface. 
Multiple windows, icons and 
pop-up menus mean con¬ 
venient access to programs, 
files and directories. 


High-performance 
graphics for the 
technical 
professional. 

The Optimum V offers 
impressive features. Like a 
high-resolution (1280 xl024), 
bit-mapped graphics display 
in monochrome or color. 
32-bit VME architecture and 
AMD 29116 graphics display 
processor. Together with 
11.2 MHz 68010, dual ported 


Support and more. 

Integrated Solutions is behind 
you all the way with on-site 
hardware maintenance and 
local software support. Call us 
now for more information. 

Integrated Solutions 

An NBI Company 
2240 Lundy Avenue 
San Jose, CA 95131 
800-538-8157, ext. 823 
In California, 800-672-3470, 
ext. 823 
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DATABASE OVERVIEW 




Database management systems 
typically have a preferred way of 
organizing data. 


potentially enormous. In general, network database 
systems are so gargantuan that they either require 
huge processors or require that the user handle 
“strange cases”. 

Hierarchies and networks share one particularly 
annoying property: the physical representation of 
data (that is, what I can get to quickly from where I 
am now) is intertwined with the data’s logical 
organization. For small applications or applications 
that are well understood in advance, this is not a 
problem, but if the organization of the data (the 
schema in database parlance) changes, all pro¬ 
grams accessing that data may need to be rewritten. 
The property we are looking for is called data 
structure independence. For example, UNIX rou¬ 
tines like getpwnam permit relatively trivial inser¬ 
tion of hashed password files since they hide the 
physical structure of the data, whereas a change in 
the directory format requires changes to many 
programs that know the physical format of 
directories. 

A relatively recent development is the relational 
model. It was originally considered little more than a 
mathematical curiosity, since it was “obviously” too 
inefficient to actually implement—much as tree- 
structured file systems, device independence, and 
dynamic processes were “obviously” too inefficient. 
As a result of this genesis, a large amount of the lan¬ 
guage surrounding the relational model is math¬ 
ematical rather than intuitive. 

In the relational model, data is structured as 
tables. These tables are physically disjoint from 
each other, although they may be logically related. 
For example, a database describing the parts 
making up an assembly might have one relation 
(“table") containing the list of parts making up each 
assembly, another relation containing the list of 
suppliers that supply each part, and so on. Connec¬ 
tions are made using logical links: to find the list of 
suppliers that make parts for a given assembly, find 
the set of parts that will be required, then find the 
list of suppliers making those parts. 

There are several important points to this 
example. First, the data language used to access the 
database is normally non-procedural (that is, it 
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describes what data is wanted rather than how it is 
obtained), and set-oriented rather than datum- 
oriented. Second, the relational model depends on 
the existence of efficient search structures. Third, 
key data is duplicated; the part number is listed 
both in the “assembly” relation and the "supplied- 
by" relation. Fourth, the data structures can be 
changed transparently, since the users never say 

follow that pointer” in their programs. 

1 he relational model of data has become popular 
because of its flexibility and simplicity. Almost all 
the database products available on UNIX today are 
relational systems: to reflect this, the remainder of 
this article will focus on this type of system. 

First, though, there are several terms that bear 
description: 

A relation is a collection of semantically related 
data. For example, the /etc/passwd file is an 
example of a relation matching a login name (the 
unique key ) to information about a user. Relations 
are sometimes called tables by analogy to the 
convenient printed representation of a relation. 

A single entry in a relation is called a tuple , short 
for "n-tuple”, taken from mathematical usage. It is 
sometimes called a record (from the obvious data 
processing analogy) or a row (from the “table" 
analogy). 

Each of the individual pieces of data in a tuple is 
called an attribute, another word borrowed from 
the mathematical model. In database-land there is 
nothing smaller than an attribute, since if you could 
subdivide an attribute, you would be creating a 
hierarchy. Attributes are sometimes called fields-, 
the table analogy would call them columns. For 
example, the/etc/passwd “relation” has seven 
attributes: user name, password, user id. group 
id. geos, home directory, and shell. 

BASIC OPERATIONS 

All database systems provide a few basic opera¬ 
tions. Without these, the system would be unusable. 
However, different systems express the operations 
in different ways. In some cases these operations 
may be so hidden by the user interface that they are 
not obvious at all. 

Of course, the most obvious operation is to 
retrieve data (this is also called select, get, or query 
in some systems). Conditions can normally be 
applied to limit the amount of data that is returned. 
For example, a retrieval might ask the system to 
“show the employees who work in department 23” 
or "retrieve all employees who earn more than their 
manager”. 

Data can be added to a relation using the append 
(also called insert or add) operation. For example, 





Learn this integrated office program on one 
system, and you’ll already know three more. 
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R Office ties a powerful 
word processor together 
with the file management, 
table spreadsheet, report genera¬ 
tion and desktop management chores 
common to nearly every modern office. 

Incredibly, the UNIX version is identical 
to the DOS version. And the RM/COS. And the 
XENIX. 

Now you can switch from a stand-alone PC to 
multi-user computer systems without missing a beat. You 
take all the functions, all the commands, all your learning 
with you. 

Yates Laboratories gave R Office highest scores in its 
recent Competitive Edge report, praising its "straightforward 
and efficient user interface which products ^ 

designed by committee typically lack." Kx ■ 

It's the only office automation program 


many companies need 
to own. (And the only one 
their suppliers need to 
support.) 

We designed R Office to 
make the most out of very little mem¬ 
ory, too. After the first 320K bytes of RAM 
for installation, each additional terminal 
requires only 32K. (Compare that with other, 
so-called multi-user software.) 

R Office is already helping small companies and 
major corporations improve their office productivity. And 
unlike hopeful imitators, R Office is available, today, to help 
yours. 

Write for information to R Systems, Inc., 11450 Pagemill 
™ Road, Dallas, Texas. Or phone toll-free, 
I"W**d(800) 527-7610. In Texas, call collect, (214) 
343-9188. 
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R Office 

BY R SYSTEMS/ I NC. 


UNIX AND COS ARE TRADEMARKS OF AT&T LABS AND RYAN-McFARLAND CORR. RESPECTIVELY. XENIX IS A REGISTERED TRADEMARK OF MICROSOFT, INC. 
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DATABASE OVERVIEW 


The relational model of data has 
become popular because of its 
flexibility and simplicity. 

“add Eric Allman with an initial salary of 
$ 200 , 000 ”. 

The delete operation can be used to remove 
existing data. For example, “fire everyone with 
salaries over $20,000”. 

Data can be changed using the replace operation 
(also known as modify or update). A good example 
might be ‘‘give all programmers a 40 percent raise”. 
The replace operation can be simulated using a 
delete followed by an append , but most systems 
supply this function as a primitive operation. 

In addition, some database systems support more 
complex operations, although normally they are not 
given keywords in the language. 

The projection operation is used to select certain 
fields from the records. For example, “give me 
names and salaries (but discard the rest of the 
information)”. 

Restriction limits the number of tuples to be 
retrieved. For example, “just give me the informa¬ 
tion about the people working in software”. 

Join matches information from one relation 
against another relation. For example, “match 
employee information against department informa¬ 
tion” (this is normally used in conjunction with 
restriction, so that a real query might be something 
like: “give me employees who work in departments 
with sales over $1 million”). 

Aggregation is used to summarize data. For 
example, “give me a count of employees in soft¬ 
ware” (as opposed to a list of those employees) or 
“what is the average salary in my company?” 

INTERFACES 

Many kinds of user interfaces to database 
management systems exist. Most sophisticated 
database systems have many different user inter¬ 
face modules, varying from very powerful modules 
that require a great deal of user sophistication to 
modules that can be used with a minimum of 
training but have correspondingly less power. For 
example, a system may offer a programmer inter¬ 
face, an end user query facility, and an applications 
generator. 

Ad Hoc Query Language. An ad hoc query 


language allows a user to enter queries in a database 
language such as SQL or IDL. These languages 
require a fair amount of training to use. Popular 
languages today are non-procedural, which is to say 
they describe the data to be accessed without 
describing how to find it. For example, to find the 
names and salaries of all employees of the toy 
department, one might enter in IDL: 

range of e is employee 
range of d is department 
retrieve (e.ename. e.salary) 

where e.dno = d.dno 
and d.dname = "toy": 

Embedded Programming Language. In order to 
build up more powerful programs, it is often popular 
to embed the database sublanguage into a general 
programming language. For example: 

showsalaries() 

/ 

1 

S char name[50] : 

$ int salary? 

$ range of e is employee: 

$ retrieve ($name = e.name, $salary = e.salary) 

$ { 

printf("name=%s, salary=%d\n". name, salary): 

* } 

i 

l 

These interfaces require even more training than ad 
hoc query languages. 

Query by Example. This popular interface asks 
the user to fill out an example of what the desired 
output should look like. For example, if the user 
draws a box on the screen with columns headed 
“ename” and “salary” and puts “Eric Allman” in 
the first column and a question mark in the second 
column, QBE will assume that this means “give me 
Eric Allman’s salary”. Fairly minimal training is 
required, but complex queries are almost impossible 
to express. 

Browsers. Browsers display a single record at a 
time. A user can then update the values on the 
screen and ask the database system to change the 
tuple accordingly. Browsers are extremely useful for 
a number of common applications. 

Some browsers require that a semi-sophisticated 
user set up the screen format in advance, after 
which naive users can access the data. More clever 
browsers will set up screens themselves, so that 
they can be used immediately by naive users. 
Application Generators. Many applications 
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have a number of common features that cannot be 
adequately handled by a browser. In these in¬ 
stances, an application generator provides a frame¬ 
work in which programs can be written. They vary 
from extremely simple packages to forms-based 
programming environments. In most cases, a 
medium-sophisticated user can use an applications 
generator. The applications that result can general¬ 
ly be used by very naive users. 

Report Writers. Businesses lust for reports, so 
naturally a separate class of interfaces is entirely 
devoted to producing nicely formatted reports— 
including columns of figures, page headers and 
footers, subtotals, duplicate value suppression, and 
whatever else the latest rage might be on Wall 
Street. Most report writers provide default formats 
that naive users can use to produce reasonably 
pleasing reports, with lots of hooks to provide fine 
control over format elements that are intelligible 
only to the initiated. 

Special Purpose Interfaces. The world is filled 
with special-purpose interfaces. These can vary 
from extremely simple ones (such as Automatic 
Teller Machines, usable by a wholly untrained 
public) to extremely complex ones (like the control 
program for a “factory of the future”). 

FEATURES AND TRADEOFFS 

A wide variety of features are available in 
database systems currently on the market. These 
features can be very important if you need them, but 
in virtually every case they come with an associated 
cost. 

Handling Large Databases. The grep program 
is fine for small databases (less than a few thousand 
“records”) that have moderate performance re¬ 
quirements. For larger databases or databases 
requiring fast access, more powerful access meth¬ 
ods will be required. For example, the look program 
uses a binary search algorithm on the dictionary 
and the dbm routines use hashed indices. 

Arithmetic Capability. The need to do simple 
arithmetic inside the database management system 
is common. For example, you might need to 
computer “age = 1985 - birthyear” or “metres = 
feet * 0.3048”. The awk program exemplifies this 
capability. 

Aggregation. Many applications need a sum¬ 
mary of data rather than a slew of raw values. For 
example, the wc program produces a summary of its 
input data. Aggregates can be simple, such as 
average salary or maximum age, or they can return 
a set of values, such as total population by country 
(returning one value for each country in the 
database). 


The awk program includes the ability to compute 
these aggregates using a procedural interface. High- 
level languages provide these as primitives, such as 
“avg(emp.salary)” to find the average employee 
salary. 

Data Structure Independence. Contrary to 
popular belief, computer professionals are not 
omniscient. They often fail to properly anticipate 
actual reference patterns. Data structure indepen¬ 
dence affords the ability to change the “fast access 
paths” without changing existing programs. For 
example, if /etc/passwd were hashed on login 
name, but it later became clear that it would be bet¬ 
ter to produce a B-tree on user id instead, it would be 
nice if the change could be made without all the old 
programs being affected in the process. This feature 
is common on relational systems, but rare on other 
types of systems. 

Multi-File Capability. It is often necessary to 
correlate data between files. This requires a more 
complex query language capable of expressing the 
appropriate queries. Processing, of course, gets 
somewhat more complicated along the way. The 
join command in UNIX is an example of such a 
program. 

Concurrent Access. In commercial settings, it is 
common for many people to access the same 
database at the same time. Some control must be 
provided to make sure these people do not destroy 
each others’ work. This is usually provided with 
some sort of locking mechanism. With locking, 
though, comes the potential for deadlocks, so part of 
the cost of this feature includes deadlock detection 
and resolution algorithms. 

Crash Resilience. If your data is very valuable, it 
is important that it be left in a consistent state in the 
event of a system crash. The usual definition of 
“consistent” is that “the update I was executing 
either should be completed or backed out altogeth¬ 
er”. To provide this, the database system must have 
a notion of the commit operation—that is, it must 
be able to atomically specify that an update is to be 
finished rather than backed out. Since all the 
appropriate data must be on disk, this implies a 
sync operation as well. Finally, all changes must be 
logged for the duration of the query so that they can 
be backed out if necessary. 

Transactions (or Atomic Multiple Commands). 
Often an operation that should be considered 
atomic must actually be implemented using several 
smaller operations. For example, to transfer money 
from one account to another, each account must be 
updated. In the middle of the transfer, there is a brief 
moment when the money either disappears entirely 

Continued to Page 96 
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DBMS on the UNIX trail 


by Roger J. Sippl 


On its way to becoming a 
standard commercial operating 
system, UNIX has sometimes 
been criticized for not being par¬ 
ticularly “commercial”. Some 
say that since UNIX was devel¬ 
oped in a research lab, perhaps it 
was meant to stay there. To these 
people, it seems clear that UNIX 
was never designed to be run on 
computers used for keeping re¬ 
cords, ledgers, and other “large 
DP type” information. For them, 
programming productivity and 
engineering document prepara¬ 
tion seem more in keeping with 
the system’s flavor. But are they? 
In the years since UNIX moved 
into spheres outside of the re¬ 
search lab, what has it shown 
itself to be truly good at? 

It’s an interesting question 
with many sides to consider. For 
the moment, let’s focus on the 
ways in which UNIX has become 
commercial. Certainly the ability 
to support data processing appli¬ 
cations and to make large corpo¬ 
rate information resources avail¬ 
able to a large body of users are 
important criteria for commercial 
success. The best way to see if 
UNIX meets the test is to evaluate 


whether database management 
systems can be built to run—and 
run well—on it. If so, one can be 
assured that UNIX is legitimately 
commercial, since all commercial 
applications will ultimately be 
built using DBMS tools. 

In the early days of UNIX, there 
was some question as to whether 
it could meet this test. Shortcom¬ 
ings existed—some of which still 
have not been completely solved. 
The story to tell here, though, is of 
how a non-commercial operating 
system has been commercialized, 
mostly through standards born of 
necessity. 

We are nearing the end of the 
changes UNIX must undergo as 
part of the process, so this is a 
good time to recap some of the 
issues that have been addressed 
over the last five years. At times it 
will be necessary to discuss spe¬ 
cific implementation issues and 
how they have been addressed. 
Since my own company’s prod¬ 
ucts are best known to me, I will 
occasionally mention how these 
issues have affected the architec¬ 
ture of the Informix DBMS and 
the C-ISAM indexed file system 
subroutine library. By no means, 
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though, should it be thought that 
these are issues limited to these 
products alone. 

The major database issues that 
have been addressed during the 
commercialization of UNIX have 
included: 

1) Record locking. This is prob¬ 
ably the best known. 

2) Languages. In the corporate 
and government data process¬ 
ing world, COBOL and DBMS 
go hand in hand. The original 
definition of DBMS taken from 
the mainframe environments 
of the 1960s and ‘70s was, in 
fact, one comprised of subrou¬ 
tine interfaces to COBOL. Nev¬ 
ertheless, UNIX has a heavy C 
bias that often shows. 

3) The file system. The UNIX 
operating system actually used 
to limit the size of a file to 1 MB 
so that users wouldn’t get car¬ 
ried away. To get database 
packages to run, some system 
administrators actually had to 
turn off this protection against 
“runaway” programs. Early 
UNIX implementations also 
sometimes imposed “indirec¬ 
tion” overhead whenever they 
encountered big files. 

4) Multiprocessing architecture 
and shared memory. DBMS 
code is usually pretty big, and 
it likes to buffer lots of disk 
data. Can UNIX handle the 
challenge? 

RECORD LOCKING 

One of the strengths of UNIX is 
that it offers multiuser capabili¬ 
ties. Thus, it is imperative that 
any database system designed 
to run under UNIX be able to 
operate properly in a multiuser 
environment. 

What does it mean to “operate 
properly”? The answer is not 
obvious. It turns out that a funda¬ 
mental problem arises when sev¬ 
eral users change the same body 


Each new machine can 
often pose an 
adventure in record 


locking. 


of information at the same time. 
Fortunately, there are ways to 
solve this, but note that I say 
“ways” rather than “way” be¬ 
cause not all versions of UNIX 
agree on the same solution. To 
illustrate the problem and a typi¬ 
cal solution, let’s look at an 
example. 

Consider a database system 
with a screen-oriented data entry 
and inquiry module (most com¬ 
mercial UNIX database products 
contain such a package). In our 
example, the user has designed a 
database file to contain records 
that list a customer name and a 
corresponding receivable, like so: 

Customer Name Amount Owed 


John Smith 20.00 
Paul Stevens 230.00 
Jane Doe 750.00 
Mary Q. Public 570.00 


Now consider a data entry and 
maintenance screen that allows a 
clerk to find a record, update it, 
and write it back to the database. 
The actual sequence of operation 
follows: 


1) The user presses the “Q” key, 
which represents the screen 
package’s “query” command. 
The user is presented with a 
screen that allows part or all of 
the customer’s name to be 
entered into a customer name 
field. With this information, 
the screen package software 
can then search the database 


for that customer’s record and 
return it to the screen. 

2) The clerk then strikes the “U” 
key (for “update”) and has the 
option of changing any infor¬ 
mation on the screen. 

3) By striking the ESC key, the 
operator can tell the screen 
package to write the updated 
record to the database stored 
on disk. 

Let’s look a bit closer at what is 
actually going on inside the com¬ 
puter system over the course of 
these three steps. Assume the 
record of Paul Stevens is the one 
to be changed. Prior to step 1, it 
resides on disk, having no reason 
to be read into main memory. 

When the search is performed 
as part of the query in step 1, 
though, Mr. Stevens’ record is 
found and read into main mem¬ 
ory. At this point, one copy of the 
record exists in main memory, 
while another remains on disk. 

In step 2, the clerk changes the 
record on the screen. This modi¬ 
fies only the copy of the record 
contained in main memory. Only 
after step 3 is executed can the 
main memory copy of the record 
be written back to disk. Thus, 
only after the operation is com¬ 
plete will the main memory copy 
and the disk copy agree. (In most 
UNIX systems, a “buffer flush¬ 
ing” issue must also be consid¬ 
ered as part of this scenario, but 
let’s neglect this subtlety for the 
time being since it does not affect 
our fundamental concern.) 

Note that no problem arises 
from the fact that the main 
memory and disk copies differ for 
a short period of time. Even if the 
clerk should go on a coffee break 
without first telling the screen 
package that an update was com¬ 
plete, two copies of the same 
record could differ for a fairly 
lengthy period without trouble 
arising. 
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But what if we now introduce a 
second clerk? Let’s say both clerk 
A and clerk B are updating the 
same record concurrently. Kindly 
note that “concurrent” does not 
mean “simultaneous”. A com¬ 
puter’s CPU can only do one thing 
at a time. To say that two events 
are taking place concurrently 
means that the second event 
begins before the first event fin¬ 
ishes. Since UNIX offers time- 
slice multiuser and multitasking 
functionality, it can run programs 
concurrently. The switching be¬ 
tween processes is fast enough to 
be transparent, making it appear 
as though processes occur simul¬ 
taneously. If this weren’t the case, 
UNIX would have gained precious 
little acceptance as a multiuser 
system. 

Returning to our two clerks, 
let’s say clerk A reads the Paul 
Stevens record from disk into 
main memory and changes the 
$230.00 to $1000.00 since a 
new invoice has been sent for 
$770.00. However, before clerk A 
gets the opportunity to write the 
record to disk by striking the ESC 
key, clerk B reads another copy of 
the Paul Stevens record into main 
memory. This leaves us with 
three copies of the record: one on 
disk, one in clerk A’s main mem¬ 
ory “data area”, and a third in 
clerk B’s data area. 

At this point, clerk B changes 
the balance from $230.00 to 
$235.00 to reflect a new $5.00 
bill for Paul Stevens. 

Unknown to the clerks, they 
have become participants in a 
race. If clerk A writes the Paul 
Stevens record to disk first, clerk 



will overwrite the $1000.00 one 
already stored on disk. If, on the 
other hand, clerk B writes the 
record first, clerk A’s $1000.00 
balance will be the amount stored 
on disk. The true balance, of 
course, should be $1005.00. 

The fundamental problem this 


Almost every 
manufacturer with a 
locking problem is 
actively working on it. 


scenario illustrates is known as 
“concurrency control”. There are 
several places in a multiuser 
operating system where concur¬ 
rency control can become a prob¬ 
lem. Besides the problem with 
concurrent updates, several us¬ 
ers might choose to send jobs to 
the same printer concurrently. 
Most multiuser operating sys¬ 
tems provide solutions for this 
fundamental problem by using 
“locks”. 

Locks do not apply when some¬ 
thing is merely read into main 
memory. But once an update is 
initiated, screen package soft¬ 
ware should lock the record. 
Locking a record usually doesn’t 
mean that other users can’t read 
it, but it does mean that other 
users can’t lock the same record. 
Once an update is complete, 
though, the record will be written 
back to disk, and the screen 
software will unlock the record so 
that other users can modify it. 

To see this in practice, let’s 
take one more look at the two- 
user scenario, using the assump¬ 
tion that the system offers the 
ability to lock files. Clerk A reads 
the record and begins to update it. 
The Paul Stevens record thus is 



update it. Since the record is 
already locked, the screen soft¬ 
ware acknowledges this by print¬ 
ing a message on the clerk’s 
terminal explaining that the rec¬ 
ord is temporarily unavailable, 
and that it would be best to try the 


update again shortly. 

Notice that the means for affix¬ 
ing a lock, detecting a lock, and 
removing a lock really must re¬ 
side in the operating system it¬ 
self, as personified by subroutine 
calls to the various programming 
languages running on the system. 
The lockingjob falls to the operat¬ 
ing system because all the pro¬ 
grams running on the computer 
need to coordinate their activities 
through “lock tables”. It would be 
unreasonable, probably impossi¬ 
ble, for all the programs running 
on a computer at the same time to 
make inquiries of all other pro¬ 
grams on the system before lock¬ 
ing a record. 

This may make it seem obvious 
now that record locking is a job 
for the operating system, but it 
took a long, hard battle to get 
such facilities included in com¬ 
mercial versions of the UNIX 
operating system. Not until Sys¬ 
tem V Release 2 was it that an 
official description of a locking 
call was included. 

Nevertheless, some database 
management systems available 
on UNIX have been providing 
multiuser database functions for 
five years. How have they done 
this without a locking call? The 
truth is: they haven’t. The void 
left by official releases of UNIX 
actually has been filled by a 
number of vying alternatives. 
The locking calls for the Fortune, 
Onyx, and Plexus machines are 
all the same. But the locking call 
for Xenix differs from these, and 
the Zilog locking call offers yet 
another variation. All work fine; 
they are just different. Since In- 
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80 different machines, though, 
we have found that each new 
machine can often pose an ad¬ 
venture in record locking. The 
System V Release 2 standard is 
helping in this regard, however. 

The source code we use to 
do locking has ifdejs for code 
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ASSOCIATES, INC 

WE DON’T 
PRODUCE TRAINING. 

We Produce Shell Programmers, C Programmers, Ada Program¬ 
mers, System Administrators, Kernel Hackers, Doc Preppies, 

and Project Managers. 

•We will work with you to find out what your people need to know. 

• At no charge, we will propose a curriculum tailored so that your people 
are immediately productive. 

• Our instructors will deliver the courses or you can license the courses 
and well teach your teachers. 
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WE ARE ALSO COMMITTED TO BRING TO 
MARKET A LINE OF SOFTWARE TOLLS 
TARGETED AT PROGRAMMER PRODUCTMIY. 

The first of these products is: 

shacc-the shell accelerator-is a compiler for the Bourne shell. It translates Bourne shell pro¬ 
grams into C and then invokes the C compiler to produce an "a.out" file. The C code that 
is generated is well-structured and very readable, so it can be further optimized by hand if 
you like. 

shacc allows you to write production code in'the Bourne shell: Do the fast prototyping in shell 
and then shacc it and ship it. 

Call us for information about our on-line demonstration. 

shacc / 

By Paul Ruel 

Concentric Associates 


SHAGC UP WITH-. 

CONCENTRJC ) 
ASSOCIATES, INC 


For further information on our Educational Services or shacc, call or write: 

Linda Cranston/Concentric Associates, Inc/One Harmon Plaza/Secaucus, NJ 07094 
201-866-2880 
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to handle dozens of different 
locking calls and “non-operating 
system” locking call schemes. 
Schemes that exist outside the 
operating system use some more 
esoteric tricks to work around the 
UNIX kernels that lack locking 
calls. These schemes are not 
needed as much anymore, but 
they have been used in the past, 
and some of the better work¬ 
arounds are still in use. 

For example, UNIX 4.1 and 
4.2BSD versions do not have 
record locking calls per se, but 
4.2 does allow programmers to 
lock a file. A clever programmer 
can develop lock table manage¬ 
ment code to secure records or 
anything that needs to be locked 
quickly. When System V came out 
prior to Release 2, it did not have 
a locking call, but it did provide 
for semaphore calls. This has 
allowed clever programmers to 
lock the lock tables. However, 
with both the 4.2 Jlock call and 
System V semaphores, the lock 
tables have had to be kept in a 
“public place”, such as a disk, 
and performance of updating has 
been affected (although not badly 
because of the least-recently- 
used disk buffering scheme uti¬ 
lized by UNIX). 

For all the improvements, 
some of the machines currently 
on the market still lack a locking 
call. Those that use early versions 
of UNIX typically have no locking 
facilities whatsoever unless the 
manufacturer has added them 
independently. It should be said, 
though, that almost every manu¬ 
facturer with a locking problem is 
actively working on it. Most 
DBMS software in the UNIX mar¬ 
ket thus is safe, though some 
packages still do not provide rec¬ 
ord-level locking. Those that lack 
record locking must change be¬ 
cause even though it is easier to 
lock an entire file, users cannot 
obtain as much freedom and 
“concurrent throughput” that 


way. 

LANGUAGES 

First, there was C under UNIX. 
But for business users, COBOL 
has always ranked number one. 


Happily, COBOL has been suc¬ 
cessfully ported to UNIX by sev¬ 
eral third-party software com¬ 
panies. Better yet, the interactive 
debuggers and “workbench” type 
environments available for CO- 


Easier 

than 

1 - 2 - 3 ... 


BUT DESIGNED 
FOR LARGER 
SYSTEMS 



It’s simple, C-CALC from DSD Corporation is 
more flexible, has more functions, and is easier 
to use than the best selling spreadsheet. We 
made it that way for a very simple reason, you’ll 
get more work done and make better decisions 
in less time. That’s what makes you successful 
whether you are planning for the future, fore¬ 
casting trends, or analyzing profits. 

The most popular spreadsheets require a great 
deal of time to get up and running. When we 
created C-CALC we kept in mind that time is 
your most important resource. Our On-Line 
Help facilities, prompts and menus allow even 
someone with minimal experience to see 
meaningful results in very little time. Our built- 
in training procedures let you pace your own 
learning with tutorial topics that range from 
basic to advanced. As you become more expe¬ 
rienced, C-CALC allows you to bypass 
prompts and menus to save even more time. 


P.0. BOX 2669 
KIRKLAND, WA 98033-0712 


■ EFFECTIVE SOFTWARE FOR BUSINESS 


So call DSD Corporation at (206) 822-2252. 
C-CALC is currently available for: UNIX, VMS, 
RSTS, RSX, IAS, P/OS, AOS, AOS/VS (Data 
General), IBM CSOS. 

C-CALC is a registered trademark of DSD Corporation. UNIX is a registered 
trademark of Bell Labs. P/OS. RSTS and RSX are registered trademarks of 
Digital Equipment Corporation. AOS and AOS/VS are registered trademarks 
of Data General Corporation. 
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BOL programmers under UNIX 
are among the best offered 
anywhere. 

However, the interfaces be¬ 
tween COBOL and DBMS prod¬ 
ucts have not been quite so illus¬ 
trious. COBOL has a file type 
ISAM (Indexed Sequential Access 
Method) that allows programmers 
to declare “search keys” on se¬ 
lected fields of a data file. The 
ISAM file type facility allows pro¬ 
grammers to quickly search for 
records by the contents of fields 
rather than by record number. 

This ISAM file type is typically 
supplied by the operating system, 
largely because it can thus be tied 
in with record-locking mech¬ 
anisms. But since UNIX had 
no record locking when it first 
became commercially available, 
it certainly didn’t offer ISAM. 
Indeed, UNIX programmers could 
not even read records by number, 
which was considered a standard 
capability by most other operat¬ 
ing systems at the time. Instead, 
records had to be read by byte 
position. 

The same suppliers that pro¬ 
vided COBOL under UNIX solved 
these problems by implementing 
fixed-length records on top of the 
byte-pointer style of interface of¬ 
fered by UNIX. Some COBOL 
implementations used C-ISAM 
since they were often written in 
C. 

This use of a standard ISAM for 
UNIX has supplied an interface 
between DBMS tools and COBOL 
(as well as other languages using 
the same ISAM). The COBOL 
CALL verb has also been used 
effectively as a subroutine style 
interface (by both Informix and 
Unify). These interfaces almost 
always have worked better when 
they have come from C rather 
than COBOL, demonstrating the 
UNIX prejudice. But this is now 
changing in response to commer¬ 
cial demands. 

The best technology for inter- 


DBMS software often 
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facing a language with a DBMS is 
not by way of a subroutine at all, 
but rather by using an “embed¬ 
ded query language” approach. 
Until recently, this was used only 
by Ingres (embedded Quel), but 
now it is also available in the 
Informix-SQL product line for C 
and COBOL (embedded SQL). 

The embedded approach al¬ 
lows programmers to interleave 
statements of a query language 
with statements written in a 
high-level language, such as C or 
COBOL. The term “query” lan¬ 
guage is a bit of a misnomer since 
languages such as SQL or Quel 
really have statements for insert, 
delete, update, and structure¬ 
changing operations in addition 
to sophisticated data retrieval 
statements. 

These embedded languages, 
even when used on non-UNIX 
machines, are usually compiled 
by first translating the database 
language into the syntax and 
subroutine calls of the host lan¬ 
guage. This technique, known as 
“running a pre-processor”, is old 
hat in the UNIX community. 

C has always had a pre-proces¬ 
sor, and many of the software 
development tools and program 
text editors on UNIX cope with 
such compile-time architectures 
well. Error messages that come 
from the compiler of the underly¬ 
ing language contain error mes¬ 
sages referencing line numbers 
from the original source listing— 
if the product is set up to do this 
and if it follows the rules of UNIX 
pre-processors. In short, since 


language processing is something 
UNIX has always been good at, 
these types of programming pro¬ 
ductivity tools fit well into the 
UNIX environment. Considering 
that pre-processors offer the best 
interfaces to database systems, it 
could be said that UNIX enjoys a 
significant advantage over com¬ 
peting “commercial” operating 
system environments. 

THE FILE SYSTEM 

UNIX was written to operate on 
machines that had meager pro¬ 
cessing resources by today’s 
standards. Accordingly, it could 
not impose much overhead. It’s 
hardly surprising then that the 
designers produced a system built 
around the assumption of small 
files. This was reflected in design 
tradeoffs that assessed a perfor¬ 
mance penalty whenever files 
were large. 

How did this penalty work? In 
essence, by wreaking havoc with 
the inodes of large files. The inode 
for each file is a block of disk that 
explains who created the file, 
when it was last modified, what 
the permissions are, where to find 
the file on disk, and how big it is. 
The information about size and 
location needs to be fairly concise 
to fit into a single disk block. If the 
file is too big for an inode to 
explain its location fully, the in¬ 
formation in the inode block can 
act simply as a “pointer” to a 
group of other blocks containing 
the full story. This additional step 
means, though, that the oper¬ 
ating system is spending more 
time reading and writing large 
files, because it first has to read 
and study pointers and indirect 
blocks. 

The amount of overhead im¬ 
posed by this “indirection” has 
been much debated. Some have 
estimated that retrievals on sys¬ 
tems with “old” architectures 
can be slowed by as much as 
20 percent. Discussions of this 
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question typically have been 
more theoretical than practical, 
though. And the question is be¬ 
coming more academic still—for 
a number of reasons. 

First of all, the block size on 
most UNIX machines is changing 
from 512 bytes to IK, 2K, or even 
4K bytes. The bigger the ma¬ 
chine, the bigger the block size— 
and, in more cases than not, the 
bigger the database files. Also, 
the bigger the block size, the 
bigger the file can be without 
“overflowing” its inode. As block 
sizes continue to grow, it becomes 
less likely that overflow will occur 
at all. 

Also, the 4.2BSD implementa¬ 
tion contains a “fast” file system 
that has been reorganized to 
diminish or eliminate this prob¬ 
lem. Although 4.2BSD is not 
widespread commercially, many 
of its concepts for building a file 
system capable of handling large 
files have been picked up by 
commercial implementors. 

Another reason for diminish¬ 
ing concerns over inode overflow 
is the falling price of main mem¬ 
ory. This may not seem relevant, 
but because of the elegant buffer¬ 
ing scheme of UNIX, it is. Even 
where indirection is necessary, 
indirect blocks can easily be lo¬ 
cated in main memory to facili¬ 
tate frequent use. Users with big 
database files who must work 
with antiquated file system archi¬ 
tectures and small machines can 
get around the overhead by in¬ 
creasing the number of operating 
system file buffers whenever per¬ 
formance problems become no¬ 
ticeable. The default number of 
buffers is usually very low—often 
30 (each only 512 bytes long, 
giving the system a total of 16K 
bytes for file system buffering). 
This could be enlarged to 50, 75, 
or even 100 buffers, and still 
be conservative. Increasing the 
number helps ensure that the 
indirect pages, if any, remain in 
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main memory. 

Some DBMS vendors have de¬ 
veloped file systems able to “work 
around” UNIX in order to address 
the indirection problem. These 
solutions usually require that the 
database be placed on what is 
called a “raw disk” logical or 
physical device. The data is then 
written by the DBMS software to 
an “unmounted device”, obviat¬ 
ing the need for formatting it for 
the UNIX file system. 

The disadvantage to this ap¬ 
proach is that UNIX doesn’t know 
how to cope with these “raw 
disk” chunks, and thus many 
UNIX utilities—such as Is, tar, 
df, icheck, dcheck, fsck, and the 
like—are of no use. Also, other 
programs cannot share space on 
the disk allocated to a DBMS file 
system. This means that any 
space set aside for future DBMS 
growth translates by necessity 
into wasted system space for the 
present. Nevertheless, those us¬ 
ing the “work-around” approach 
are best advised to reserve plenty 
of space at the outset since major 
problems are sure to arise if the 
space should ever fill. In the event 
the space does become saturated, 
the entire device will have to be 
unloaded to tape, reconfigured as 
a larger device (using some of the 
more esoteric UNIX utilities), and 
then reloaded. This process also 
usually means that some other 
logical disk device will have to be 
borrowed from to obtain the nec¬ 


essary space. This means other 
data will have to be unloaded as 
well. Databases that span disks 
are more complicated yet, since 
UNIX pathnames are not avail¬ 
able for non-UNIX file structures 
and thus the process cannot be 
made transparent. 

MULTIPROCESSING 
ARCHITECTURE AND 
SHARED MEMORY 

“Commercial” data processing 
has traditionally allotted special 
privileges to DBMS code. In fact, 
DBMS software often looks like 
part of the operating system. At 
times, it actually is. 

Why did this come to be? 
Mainframe DBMS code generally 
has been very big (.5 to 2 MB, 
depending on the product), and 
typically has had to be able to 
handle several programs at the 
same time. It also has tradition¬ 
ally wanted to buffer lots of data, 
just like an operating system’s 
file system. 

Because of these design issues, 
database systems usually have 
been implemented with front- 
end/backend architectures. The 
“backends” have typically con¬ 
tained DBMS software machines 
capable of performing fast in¬ 
dexed retrievals, storing new 
data, and maintaining indices. 
The front-ends either have of¬ 
fered user programs that were 
linked to DBMS software through 
a programming language inter¬ 
face library, or collections of 
DBMS tools like interactive query 
languages, screen packages, or 
report writers. 

Is the front-end/backend ar¬ 
chitecture essential? Can UNIX 
provide it? The answers depend 
chiefly on the size of the machine. 
Small machines usually don’t 
have many users and, as a result, 
don't need the two-process archi¬ 
tecture as much as larger ma¬ 
chines. In fact, the overhead of 
Continued to Page 98 
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An interview with Peter Weinberger 


Within AT&T Bell Laborato¬ 
ries, Peter Weinberger is known 
for many things. One is his 
knowledge of how databases 
work under UNIX. 

As head of Computer Sys¬ 
tems Research at the Labs, 
Weinberger is concerned with 
more than just data manage¬ 
ment. But that nevertheless is 
where he made his mark nearly 
10 years ago when he joined 
with Alfred Aho and Brian Ker- 
nighan to develop awk, still one 
of the most powerful data ma¬ 
nipulation tools available un¬ 
der UNIX. (The command name 
itself stands for "Aho, Wein¬ 
berger, and Kernighan"—a 
classic example of just how 
obtuse UNIX mnemonics can 
be.) Since those days, Wein¬ 
berger's database interests 
have branched into new direc¬ 
tions, as evidenced by a paper 
he published in the November, 
1982, issue of the Bell Sys¬ 
tem Technical Journal, "Making 
UNIX Operating Systems Safe 
for Databases". 

He claims his focus is now set 
elsewhere, but the insights that 
lace his responses to the ques¬ 


tions that follow would indicate 
otherwise. Contributing Editor 
Ned Peirce, a consulting sys¬ 
tems analyst, delivered the 
queries. 

REVIEW: How well does the 
UNIX system support da¬ 
tabases? 

WEINBERGER: There are two 
issues that come to mind. First 
of all, there is the commercial 
issue. I do not really understand 
the commercial issues and the 
ground rules for discussing what 
you need in various types of 
products. That has a lot to do with 
marketing, and I explicitly dis¬ 
claim any understanding of 
marketing. 

The other question is: what 
would it take to make UNIX 
suitable for various kinds of data¬ 
bases—small databases, big da¬ 
tabases, United Airlines reser¬ 
vation systems, and all the rest of 
it? This is the question that 
interests me. Does UNIX require 
big changes to handle database 
systems? Here, “database sys¬ 
tems” roughly means conven¬ 
tional transaction processing. 
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sary, since it could almost be 
replaced by cat. 

REVIEW: I am told that you 
have some interesting insights 
into the problems database sys¬ 
tem designersjaced during the 
development oj the United Air¬ 
lines reservation system. 

WEINBERGER: No special in¬ 
sight. What I heard was that 
though they always bought the 
biggest and fastest computers, 
they still wanted more. They 
weren’t so worried about reliabil¬ 
ity, though. 

REVIEW: They had a capacity 
problem? 

WEINBERGER: A real capacity 
problem. If you think of transac¬ 
tion rates measured in hundreds 
per second, which they were, and 
allow yourself a little growth 
while you do your research, you 
find yourself considering transac¬ 
tion rates of thousands per sec¬ 
ond. This starts to get pretty 
tricky. If I want to go to disk, I 
know it used to take 30 msec to 
get a block off the disk. But if I 
assume there’s been improve¬ 
ment, and that the time it takes to 
get a block off the disk might go 
down to 10 msec... 

REVIEW: It won't leave much 
time to do anything. 

WEINBERGER: That’s right. You 
need lots of disks so that you can 
get all these things done in time. 

REVIEW: Or run the risk of 
losing data? 

WEINBERGER: Right. It’s a real 
problem. Now that memory is 

getting cheaper you might be able 
to keep much of the database in 
memory. 



into reliability problems? 


WEINBERGER: You have to de¬ 
cide what you’re going to do about 
logging, and what you’re going to 
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do about backups, and what 
you’re going to do about memory 
card failures. I’m not sure you 
have to actually do anything 
about memory card failures, but 
it is one of the questions that 
arises. After having decided all 
that, you then get to decide what 
you're going to do about access 
structures on the disk. You can 
use B-trees or some kind of clever 
hashing scheme that matches the 
fact that you copy in disk blocks 
as the fundamental unit of ac¬ 
cess. But, that’s not what’s going 
on in memory, right? Memory can 
get at individual words equally 
fast, so you can try to change the 
access structures or you can for¬ 
get about it; it may not make any 
difference. I don’t think it’s all 
that hard. It is an intriguing way 
of getting more performance, 
though. Just put a few hundred 
megabytes of memory on your 
machine and move large parts of 
the database into memory and see 

what happens. 


REVIEW: Wouldn't you need 
some scheme for reading the 


I (01 



medium? 

WEINBERGER: I don’t think you 
should leave it to good fortune. 
You would need to have a piece of 


the algorithm write log records. 1 
think you could write whatever 
you would have normally written 
had the data been on disk. 

You also need ways to handle 
the kinds of high transaction 
rates you can expect when you get 
lots of people typing and submit¬ 
ting transactions at the same 
time. In that situation, the system 
won’t respond to transactions 
right away when it processes 
them. It won’t respond until it has 
batched up all the little transac¬ 
tion records, written the log, and 
written the whole thing out to 
disk. Only then will it respond to 
all those people. If you’re doing 
1000 transactions a second, you 
can write very large log records 
because you can do them 100 at 
a time. You can do that because 
you only have to write 10 large 
blocks a second, which is easier 
than 100 or 1000. A delay of a 
tenth of a second in a response is 
barely perceptible. So the scheme 
should work okay. There are a lot 
of little questions and details, 
though. It sure wouldn’t look 
much like a timesharing system. 

REVIEW: Would it look like a 
UNIX system? 

WEINBERGER: It could look like 
a UNIX system, but you would 
never actually see the UNIX sys¬ 
tem itself. Let’s say you turn on 
your machine for the first time 
and then load up all this software 
that eats up all your memory. 
That would not encourage people 
to do a lot of software develop¬ 
ment or whatever on the system. 

That’s to say that for all that 
work, you will not have solved 

any of your problems except lor 

speed. All the problems of inde¬ 
pendent transactions, safety, and 
deadlock still have to be dealt 



still there. 


REVIEW: And all that requires 
hacking in the kernel? 

Continued to Page 100 
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WEINBERGER INTERVIEW 



REVIEW: How would you go 
about putting a database sys¬ 
tem together under the UNIX 
system? 

WEINBERGER: There seem to be 
three approaches—or some num¬ 
ber like three. People who build 
database systems under UNIX 
can build them without using the 
UNIX file system; that is, they can 
take a raw disk and use some 
scheme like shared memory to 
make buffering to the disk more 
efficient and easily controlled by 
the database management sys¬ 
tem. This means, though, that 
they have to implement their own 
concurrency control mechanisms 
to make sure transactions don’t 
stomp on each other in unaccep¬ 
table ways. That’s been done in a 
fair number of products sent out 
by Bell Labs to operating phone 
companies. The people who keep 
track of cable repair work have 
database systems of this sort. 
These are transaction processing 
systems that are at least partly 
run on UNIX. They just use a raw 
disk scheme to take control with¬ 
out making big changes in the 
kernel. 

Another approach is to go 
whole hog and get out the data¬ 
base literature, read through all 
the concurrent control stuff, and 
say, “Well, we’ve just gotta put all 
that stuff in the kernel.” That 
would be fairly remarkable, but 
it’s sort of possible. 

The question is: might there be 
an intermediate approach that 
could give you control where you 
absolutely need it without distort¬ 
ing the underlying image of UNIX 
too much? 

REVIEW: And in that way see 
to it that the kernel remains 
useful to other people? 

WEINBERGER: That’s right. You 
don’t want to distort the kernel 
too much. It’s fairly plastic, but if 
you stretch it a lot, it gets sort of 
rigid. You want to avoid that. But 


solutions are still more or less a 
research question. My view is 
that putting a database system 
together under UNIX is possible, 
but that a couple of problems will 
have to be solved along the way. 
Some of those problems are hard 
and some are easy. 

The easy part is the control of 
stuff in the system—making sure 
that disk buffers have gone onto 
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disk, so that if the system crash¬ 
es, you’ll know what the state 
was—that sort of thing. That’s 
all fairly trivial. 

REVIEW: Ordered rights , that 
type of thing? 

WEINBERGER: Yes, and if neces¬ 
sary, a special system call that 
flushes out all the disk buffers 
associated with a specific file. The 
conventional image of a database 
system is that a single process is 
in charge that somehow keeps 
track of everything. The classic 
UNIX system approach to it would 
be that roughly each terminal, 
each person, each transaction 
would have a process, and then 
somehow you would have to get 
the processes to communicate 
among themselves to resolve 
deadlocks. Apparently, one of the 
most efficient ways of controlling 
concurrency in databases is to 
lock shared resources. 

REVIEW: If something's locked, 
do you just keep retrying until 
it's unlocked? 

WEINBERGER: You have differ¬ 
ent choices there. The real ques¬ 
tion, though, is: what happens 


when you get a lock on something 
and somebody else wants a lock 
on it as well. They can wait for a 
small amount of time and try 
again, or they can somehow regis¬ 
ter a request that says, “Let me 
know when the lock becomes 
free.” 

REVIEW: And then sleep on it? 

WEINBERGER: Lots of commer¬ 
cial systems have these things 
that allow you to enqueue on a 
lock and then somehow get ser¬ 
viced later. There are two parts to 
this. One is that if you have a 
database system in which there 
are lots of lock conflicts, you 
probably have an inefficient sys¬ 
tem. Presumably, though, you 
have a system in which you get 
very few of these conflicts. If so, 
you probably hate to have to go to 
the operating system each time 
you want to get a lock because 
that takes 500 microseconds 
or whatever. One good place to 
put locks in order to minimize 
such objections is in shared mem¬ 
ory. That way, when there’s no 
conflict, you can use some kind 
of atomic memory operator to 
grab things and lock them. That 
works perfectly well when there’s 
no conflict. When there is a 
conflict, you have two indepen¬ 
dent processes that have to talk to 
each other somehow, and that’s 
through the operating system. 
The question is: how little can you 
put in the kernel to support 
locking and deadlock detection? 
It’s probably some fairly small 
amount. 

REVIEW: We're talking about 
changes above and beyond 
shared memory and record 
locking. Are you saying that 
those alone are , to some peo¬ 
ple's minds , insufficient? 

WEINBERGER: They are insuffi¬ 
cient, precisely because of this 
problem. Say I’m going to move 
money from my account to your 
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account, and you’re going to move 
money from your account to her 
account, and she’s going to move 
money from her account to mine. 
Let’s say we all start at once. We 
each lock our accounts because 
we’re about to change them. 
Then we all go out and try to grab 
the other accounts because we’re 
going to change them too. Now 
we’re deadlocked. Somebody’s 
got to discover this. And, when 
they do, we have to have some 
way of dealing with it. 

REVIEW: Don't you have a se¬ 
curity issue if every process can 
detect deadlocks? 

WEINBERGER: That’s why you 
want the kernel involved. That’s 
where all the coordination comes 
from. There are other ways of 
doing it, though. 

REVIEW: By creating a large 
process that can keep track of 
all the locks? 

WEINBERGER: You can have one 
process that’s in charge of lock¬ 
ing, lock resolution, and some 
kind of interprocess communica¬ 
tion. The trouble with that is that 
bad things can happen. If I lock 
my stuff and ask to lock you, and 
then I divide by zero and die; 
there’s no point in ever trying to 
talk to me again. The process in 
charge of locks needs a way of 
deciding the process has gone 
away, but some forms of interpro¬ 
cess communication don’t pro¬ 
vide that. Some forms might, but 
some forms don’t. The kernel, 
though, always knows. 

REVIEW: It also runs at a very 
high priority. 

WEINBERGER: That’s not as im¬ 
portant as knowing it all. The 
hope is that these funny dead¬ 
locks are quite rare. 

REVIEW: You mean you hope 
things occur promptly enough? 

WEINBERGER: The hypothesis 


is that locks don’t conflict very 
often, and that it requires three 
conflicts, roughly, to get a dead¬ 
lock. That means deadlocks 
should be quite rare. 

REVIEW: It's hard to prepare 
for those instances since they 
depend on the size of the data¬ 
base and the way it is used. 

WEINBERGER: Yes. That’s one 



of the dilemmas the poor data¬ 
base administrator is supposed to 
keep up with. You know how it 
is—any time you don’t know how 
to deal with something automati¬ 
cally, you say, ”Oh well, it’s up to 
the database administrator to 
handle that.” The database ad¬ 
ministrator is supposed to know 
all about how the database 
works, and if it’s not working very 
well, then of course the adminis¬ 
trator gets stuck with fixing it. 

Before saying how, we should 
digress slightly to talk about an 
argument concerning this busi¬ 
ness of detecting deadlocks. If you 
look for deadlocks too often, you’ll 
end up spending a lot of time 
looking. But if you look for dead¬ 
locks infrequently, things might 
stay deadlocked for long time. 
This gives rise to some kind of 


cost/benefit curve. If you plot the 
two crossing curves, you’ll find an 
optimal point in the middle. Jim 
Gray was the one who made this 
observation originally. 

There’s a lot of detail known 
about low-level implementations 
of database systems. You just 
have to decide how much of that 
kind of stuff to use. Implementing 
a flexible transaction processing 
database system nowadays is a 
lot of work. You have to get at the 
low-level stuff—which I find in¬ 
teresting—and then you have to 
get all the user interface, validity 
checking, bulk loading, forms 
package, and terminal handling 
stuff together. 

REVIEW: I suppose this sort of 
work is being done over and 
over again, even within the 
Labs. 

WEINBERGER: Yes. Whenever I 
have had a project, I’ve never felt I 
was building a database sys¬ 
tem—I’ve felt as if I was working 
on a project. It might be that the 
project needed a database system 
because perhaps there was none 
available that was quite suited to 
my needs. There are a lot of these 
tradeoffs to make, and I want to 
make the ones that favor my 
world. That means that when the 
next person needs a database 
system, they might discover that 
my design is a little too special- 
purpose to suit their needs. So 
they’ll just have to do another 
one. 

REVIEW: It's also true that 
those other people might never 
have heard of your database 
system. 

WEINBERGER: That’s also quite 
possible. But there are relatively 
few systems that have the sort of 
transaction processing and reli¬ 
ability you might want in the 
event of a crash. 

REVIEW: Does the limited num¬ 
ber of systems relate to difficul- 
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ties with UNIX? 

WEINBERGER: Not really. It’s 
more of a reflection of the fact 
that it takes a lot of work to build 
one of these things. If you don’t 
need it, why bother? 

REVIEW: Is it appropriate for 
the UNIX system to provide 
database facilities as an 
option? 

WEINBERGER: The question be¬ 
comes: are you going to provide 
the whole thing as a single unit, 
or are you going to provide pieces 
only. 

REVIEW: The UNIX system is 
like a toolkit now. 

WEINBERGER: That’s my feel¬ 
ing. Separate the two things—the 
pieces that you would definitely 
want to put in the system and the 
stuff that’s sort of high level, like 
the real database systems. 

REVIEW: Would that mean you 
could use whatever forms pack¬ 
age you wanted? 

WEINBERGER: Well, you might 
provide one of those also, but you 
would be able to separate every¬ 
thing into the two types of 
offerings. 

REVIEW: If you're trying to de¬ 
sign a series of database tools 
that would be generic to UNIX , 
would you try to adhere to some 
particular query language. like 
IDL, or do you think that whole 
issue needs to be re-thought? 

WEINBERGER: I do not know 
enough to answer that question. 
There are some query languages 
that seem appealing, either be¬ 
cause they’re widely used, or 
because they’re supposed to be 
very easy to use. Query By Exam¬ 
ple and these other sorts of lan¬ 
guages for filling out forms are 
often talked about. Query lan¬ 
guages and transaction process¬ 
ing are really two quite different 
things, though. 
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REVIEW: What kinds of data¬ 
base things can be done on the 
UNIX system as it stands? 

WEINBERGER: The UNIX sys¬ 
tem, as opposed to a lot of the 
systems on which these things 
were first built, already comes 
with a bunch of commands that 
you can use to search text files. 

REVIEW: Such as awk? 

WEINBERGER: Sure, awk, grep, 

or whatever. 

REVIEW: Did you design awk to 
be easy to use? 

WEINBERGER: As I remember, 
when we designed it, the idea was 
to make a lot of the syntax like C 
so that other people in the center 
wouldn’t have to learn much new 
stuff. It wasn’t designed to be 
particularly easy to use. 

A project called POPLAR was 


underway at Xerox PARC at 
roughly the same time. Their 
goals were moderately similar to 
ours, but in many ways more 
ambitious. It is not a very fair 
comparison. The Xerox guys were 
producing a functional language 
that was fancy and had a couple 
of very cute ideas in it. It was also 
designed to be easy to use. But the 
end result was that lots of people 
use awk and nobody uses 
POPLAR. 

REVIEW: Strange. 

WEINBERGER: Life is strange. I 
think it had more to do with the 
environment the programs were 
run in. It’s easier to use stuff in 
UNIX—it fits together better. The 
stuff in their environment at the 
time was harder to use. There 
was no easy way, for instance, of 
joining independent programs 
together. 

REVIEW: Do you think tools 
like sort and awk became popu¬ 
lar because the UNIX system 
became so popular? 

WEINBERGER: Yes, but I think it 
goes both ways. The fact that 
UNIX came with a bunch of useful 
little tools is one of the reasons it 
got popular. 

REVIEW: What are the limita¬ 
tions of using UNIX tools like 
grep, awk, sort, and the like? 

WEINBERGER: First of all, they 
work on text files only. It is 
possible to build databases solely 
out of text files since you can keep 
the data in text files and have 
special-purpose index trees ac¬ 
cess the data. You can also have 
tools that generate the indexes. 

Many years ago, when there 
was a big debate about how hard 
it was to build relational database 
systems, I built a toy relational 
database system as a demonstra¬ 
tion. It was made up of ordinary 
text files and it built its indexes so 
that programs, when given an 
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UNIX SYSTEM V. SOFTWARE THAT STACKS UR 


High-quality application software written 
for UNIX System V is available now. And 
new business programs are continually 
being developed. 

That means increased market oppor¬ 
tunities for VARs and ISVs. And end 
users can be certain their investment in a 
UNIX System V-based computing sys¬ 
tem is a smart, sound one. 

Quantity and quality 

The market for products that are based on 
UNIX System V has grown significantly 
over the last two years. There are now 
hundreds of packages written for UNIX 
System V. And this growth will continue. 

Quality applications—general busi¬ 
ness packages and development tools— 
are now available for the AT&T UNIX PC. 
And AT&T is putting together even more 
hardworking, industry-specific software 


to run on AT&T 3B2 and 3B5 Computers. 

With so many packages available, 
VARs can offer more turn key systems 
based on UNIX System V to an ever¬ 
growing customer base. ISVs can sell 
software to even more customers. And 
end users can invest in a UNIX System 
confident there is plenty of quality 
software available. 

Software you can bank on 
UNIX System V has gained acceptance as 
a powerful, versatile computing standard. 
More hardware vendors like NCR, Altos, 
Motorola, Perkin-Elmer and Sperry are 
joining AT&T in offering products based on 
UNIX System V. 

And weVe introduced the System V 
Interface Definition. Software written 
under the Interface Definition can run on 
current and future releases of AT&T’s 


UNIX System V, as well as various Sys¬ 
tem V derivatives offered by AT&T 
licensees. So there will be an even larger, 
more comprehensive base of portable 
software from which to choose. 

Our comprehensive UNIX System V 
Software Catalog lists a full range of pack¬ 
ages that run under UNIX System V. For 
end users it’s a reference guide to the 
programs available. And for developers it’s 
a smart way to ensure packages will have 
even greater exposure to the growing 
UNIX System V market. 

To learn more about UNIX System V 
market opportunities, order the UNIX 
System V Software Catalog—at $19.95 
plus tax. Call 1-800-432-6600 and ask for 
Operator 397. 

UNIX System V. From AT&T. 

Consider it standard. 



AT&T 

The right choice. 


© 1985 AT&T Information Systems. 































































































































WEINBERGER INTERVIEW 


index, could produce all their 
records in index order. Because of 
that, you could run grep or sort 
on that data and then put it back 
in the database. It wasn’t a won¬ 
derful system, but it worked. 

REVIEW: How absolute are the 
limitations of text files? If you 
treat some of the text as num¬ 
bers, what do you really lose? Is 
conversion going to slow you 
down a great deal? 

WEINBERGER: If you keep your 
database in text files that are 
indexed by separate files, you 
have to be careful that people 
don’t change the text files without 
changing the index files. Where¬ 
as, a database system that takes 
your data and squirrels it away 
itself gives you no choice but to 
use database programs to get at 
the data. The other problem is: if 1 
do a lot of updates on my text file, 
it’s likely that I will keep adding 
stuff at the end. That means that 
if I’m scanning the text file. I’m 
going to detect a lot of records that 
aren’t there anymore. That’s go¬ 
ing to make work a little inconve¬ 
nient, so I’m going to have to do 
something about that. With a bit 
of discipline, though, there’s no 
particular reason why this sort of 
scheme shouldn’t work fine. It 
doesn’t quite seem to fit into good 
retrievals, though. 

There is another interesting 
question concerning efficiency 
where I’m on both sides. If I have 
a database I want to deal with 
fairly efficiently that contains 
things like numbers, is it better to 
store the numbers in ASCII as a 
text file, or should I store the 
numbers in binary so I don’t have 
to convert them? A group at 
Whippany [the Whippany, NJ, 
installation of Bell Laboratories] 
decided that they had been forced 
to convert their project once too 
often. They said, “This is silly. 
Why not give up 15 percent of our 
performance and just keep the 



stuff in ASCII?’’ The performance 
hit was only that big because of 
the work their databases did. If 
their typical retrieval had been, 
“What is the average salary at 
Bell Labs?”, switching over to 
ASCII would have been expen¬ 
sive, but that wasn’t the case. 

REVIEW: They were not doing 
conversions during searches? 

WEINBERGER: That’s right. A 
lot of the work in a transaction 
database system goes into finding 
records and saving logs. A lot of 
CPU cycles go into that, but 
converting numbers isn’t all that 
slow. That’s one argument in 
favor of keeping your database in 
a sort of “texty” format. 

REVIEW: I imagine it's also 
easier to fiddle with the data in 
text format. 

WEINBERGER: You want to a- 
void fiddling with it, though, be¬ 
cause you don’t know how many 
access paths exist or how many 
ways there are for looking things 
up. If I build a static index of some 
sort to keep track of data, and 
then you go and fiddle with the 
data, you’re probably going to 
mess up the index, which would 
be quite embarrassing. 


REVIEW: What are the real ad¬ 
vantages of going to text? What 
made the Whippany people 
want to stay with it? 

WEINBERGER: It is so much 
more portable. 

REVIEW: They didn't have to 
be concerned about byte 
ordering? 

WEINBERGER: That was their 
problem. They got into this either 
because they were converting 
their systems or because a data¬ 
base that was being built over¬ 
night by an operating company 
on an IBM machine was going to 
be moved to some UNIX machines 
in the morning. There was just no 
question about the portability of 
text on the whole. So long as you 
kept your character set under 
some sort of control, most of the 
translation was well understood. 

REVIEW: There are, in fact, 
UNIX tools available for typical 
conversions. 

WEINBERGER: Right. The bit¬ 
fiddling stuff may be worthwhile 
on a big project, but writing your 
own conversion programs for a 
small project only adds complex¬ 
ity. If I want to replace some 
11/70s with 3B20s in a project 
that stores binary data in files, I 
can’t just slide one machine 
out—I have to slide them all out. 

One of the things that’s not 
very modern sounding, insofar as 
type checking is modern, is the 
fact that the common intermedi¬ 
ate file format in the UNIX system 
is basically a text format. Pipes 
connect text files and make it all 
wonderful. 

There are a few things that 
aren’t stored in text files, prob¬ 
ably for reasons of efficiency. 
Even in a couple of those cases, 
though, you could probably store 
things in text files just as easily. 
Take directories, for instance; 
they probably don’t need any 
binary. It would make Is unneces- 
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Backend machines for 
data management 

by Paula Hawthorn 


• KMCfc) 


PUTTING 
DATABASE 
WORK IN 
ITS PLACE 


A database machine is a spe¬ 
cial-purpose computing system 
dedicated to the management of 
data. When used as a backend 
processor, it can provide signifi¬ 
cant performance advantages. 
This article explores how. 

The generic term “backend 
machine” refers to a computing 
system designed to work in tan¬ 
dem with a “front-end” com¬ 
puter. While the front-end 
handles general computing, in¬ 
cluding the management of the 
user interface, the backend pro¬ 
cesses those commands sent to it 
by the front-end. The machine’s 
general-purpose processor thus is 
freed to perform other work. The 
two major commercial incarna¬ 
tions of backend machines are 
database machines and array 
processors. 

Two Rales. Backend machines 
are developed when: Rule 1) a 
task that consistently consumes 
significant general-purpose com¬ 
puter resources comes to be re¬ 
garded as a major source of per¬ 
formance problems, and Rale 2) 
the task can be executed indepen¬ 
dently with a minimum of data 
and control structure communi¬ 
cation. Both conditions are satis¬ 
fied by array processors and data¬ 
base machines. 

Array Processors. An array 
processor is a backend machine 
that accepts data (such as a 
matrix) and performs, indepen¬ 
dently of the front-end, some 
function on the data (such as an 
inversion). The changed data can 
then be returned to the front-end. 
Array processors are typically 
attached to computer systems 
where matrix manipulation func¬ 
tions consume a heavy percent¬ 
age of CPU time and take a long 
time to complete. An array pro¬ 
cessor can benefit a front-end 
computer by offloading it, thus 
making the CPU available for 
other tasks. In addition, an array 
processor can run faster than a 
front-end computer because it 
contains special hardware and 
software for array manipulation. 

Database Machines .The data¬ 
base management systems that 
are functionally complete — those 

UNIX REVIEW AUGUST 1985 51 

Illustration by Hynn Kim 




BACKEND MACHINES 



providing multi-table commands, 
transaction management, and 
protection—also require a signifi¬ 
cant amount of the processing 
power afforded by any computing 
system. A functionally complete 
database management system 
(DBMS) can contain more lines of 
code and be more complex than 
many operating systems because 
it must not only manage re¬ 
sources but also attempt to opti¬ 
mize high-level user commands. 
Several users running concur¬ 
rent, complex queries on a single 
computer can cause all other 
users to scream with frustration 
because the tricks that the DBMS 
can play to optimize database 
response time (such as locking 
buffers in memory, monopolizing 
disks with long I/O exchanges, 
and more) are just the sort of 
actions that can drag down re¬ 
sponse time for other users. A 
DBMS, then, qualifies for a back¬ 
end machine of its own if only 
because it satisfies the first rule 
by placing an onerous burden on 
the general-purpose front-end 
processor. 

The second rule, which re¬ 
quires that the backend act inde¬ 
pendently of the front-end in 
order to keep the two from spend¬ 
ing all their time exchanging data 
and control information, is satis¬ 
fied only if the relational database 
model is used. Commands in the 
relational model are at a high 
level where information and data 
are exchanged in sets of values. 

An additional reason that rela¬ 
tional data management systems 
are good candidates for executing 
in a backend machine is that they 
make it possible to pose and enter 
very complex, resource-eating 
commands instantly. In a naviga- 
tional-oriented data model, these 
commands might take weeks to 
program. The relational model 
thus greatly benefits the person 
who needs to run a query, but 
distresses other users who at¬ 


tempt to use the computer for, 
say, editing. The solution? Send 
these complex queries to the 
backend machine, where they 
can be run independently of the 
front-end system. 

FUNCTIONAL 

DIFFERENTIATIONS 

A “functionally differentiated” 
multiprocessor system is one 
where each of the processors are 
limited to permanently assigned 
tasks. By assigning the proces- 


It seems 

counterproductive to 
weigh down general- 
purpose operating 
systems with a large 
group of specialized 
commands. 


sors their tasks at system design 
time, control functions (those 
instructions that tell the pro¬ 
cessors what to do) can be min¬ 
imized. 

Functional differentiation does 
work well in a backend system 
because the front-end can be 
used for general purpose process¬ 
ing, while only those functions 
the backend is designed to handle 
need be sent there. The backend 
can then be designed exclusively 
for a narrow set of tasks. Data¬ 
base machines and array proces¬ 
sors both exemplify functionally 
differentiated machines. 

To further discuss database 
machines, I’m going to use Brit¬ 
ton Lee’s Intelligent Database 
Machine (IDM). I could just as 


well have used the Teradata 
machine [1], which is another 
commercial machine, or one of 
several academic machines (see 
bibliography), but the one I am 
most familiar with is the IDM. 

Host Functions. A block dia¬ 
gram of the IDM is shown in 
Figure 1. In it, multiple users are 
connected to front-end systems, 
which could be single-user per¬ 
sonal computers or more power¬ 
ful multiuser systems. From the 
DBMS user’s perspective, the 
chief task of the front-end is to 
parse the user’s database com¬ 
mand and send it to the IDM. This 
is performed in the “host” or 
front-end box, where we find the 
“host interface’’. Here resides the 
program that translates the com¬ 
mands to the parse trees that are 
then sent on to the IDM. 

Thus, in the functional differ¬ 
entiation model, we see that the 
host is committed to receiving 
and translating commands. The 
database machine is committed 
to handling all transaction man¬ 
agement, recovery, security, and 
protection issues surrounding the 
commands, as well as executing 
them. 

The interface to the database 
machine could have been differ¬ 
ent: in an early, important re¬ 
search project (the Relational As¬ 
sociative Processor [2]), partial 
queries were sent across to the 
database machine. In concept, 
one could also send direct user 
input to the machine, but by 
waiting until full queries can be 
syntactically verified, user mis¬ 
takes can be screened out. There 
are other advantages to splitting 
the multiprocessors: one comput¬ 
ing system (the front-end) is left to 
concentrate on the screen man¬ 
agement and query definition to 
the point that standard database 
commands are formed. Then the 
standard commands can be acted 
on by a machine that has been 
explicitly designed for that pur- 
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pose. This split (parsing in the 
front-end, execution in the back¬ 
end) was first used by DeWitt on 
the Wisconsin database machine, 
DIRECT [3]. 

Channel Functions. Figure 1 
shows that data commands enter 
the backend system via a chan¬ 
nel. This represents another op¬ 
portunity for functional differen¬ 
tiation: each channel has a 
dedicated processor for handling 
communication between the out¬ 
side world and the database ma¬ 
chine. Services for buffering, pro¬ 
tocols, and other functions can 
thus be offloaded from the data¬ 
base machine. The channel sim¬ 
ply notifies the main processor 
after it receives a command that 
there is something new to do. 

Processor Functions. The IDM 
has two processors dedicated to 
command execution: a general- 
purpose processor (a Z8000), and 
a Britton Lee-designed processor 
named the Database Accelerator 
(DAC) [4]. The DAC is an 8 mips 
reduced instruction set machine 
focused on data management. It 
can be called as a co-processor by 
the general-purpose processor via 
subroutine calls. 

The IDM has two processors 


for general command execution 
because it was decided that to 
handle up to 16 disks and still 
stay within bus/shared memory 
bandwidth limitations of the ar¬ 
chitecture, two would be enough. 
This particular architecture was 
designed around a target system, 
a target customer base, and a 
target timeframe for a deliver¬ 
able, reliable system (see Design 
Decisions , [5]). A number of other 
approaches might be equally well 
suited to different market par¬ 
ticulars. For instance, the Tera- 
data machine contains more pro¬ 
cessors to support a greater 
aggregate processing power. It 
offers a large number of proces¬ 
sors arranged in a tree-intercon¬ 
nect pattern and is correspond¬ 
ingly more complex than the IDM. 

The functional differentiation 
of the processors within a back¬ 
end machine comes from two 
sources: the hardware (as in the 
DAC), and the specialized operat¬ 
ing system selected to serve a 
dedicated environment. 

Operating System Functions. 
In a dedicated machine, optimiz¬ 
ations can be utilized in a way 
that would be difficult in a gener¬ 
al-purpose environment. For in¬ 


stance, in the IDM, the only 
function of the backend machine 
is data management. Therefore, 
at every possible point, the data 
management operating system is 
cognizant of the special environ¬ 
ment in which it is acting. For 
instance, the buffers are man¬ 
aged according to a scheme that 
recognizes types of buffers, and 
thus knows which ones need to 
stay in memory longer; the pro¬ 
cess manager shows a special 
sensitivity to data management 
by taking care not to schedule 
out processes before they are 
done with their buffers; and the 
disk allocation algorithm allo¬ 
cates space in a manner appropri¬ 
ate to databases. 

Much has been written about 
the problems that general-pur¬ 
pose operating systems give data 
management systems [6, 7, 8], 
but the point is not that general- 
purpose operating systems are so 
bad, but that functional differen¬ 
tiation is so good. In the IDM, 
the operating system and the 
data-management system were 
written as a single entity. There 
are occasional attempts to in¬ 
clude more specialized operations 
in general operating systems so 
as to make data-management 
systems run more efficiently. But 
it seems counterproductive to 
weigh down general-purpose op¬ 
erating systems with a large 
group of specialized commands. 
My view is that data management 
should be carried out on a ma¬ 
chine designed especially for it— 
and that general-purpose operat¬ 
ing systems servicing general- 
purpose machines should be kept 
relatively simple. 

Controller and Disk Func¬ 
tions. The IDM also includes con¬ 
trollers, each with its own pro¬ 
cessor. Ordinary, commercially 
available disks are used to store 
the data that the IDM controls. 
This distinguishes it from the 
British database machine, CAFS 



Figure 1 — Block diagram of the Britton Lee Intelligent Database 
Machine (IDM), exhibiting functional differentiation in a backend system. 
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fo • rum, n. (pi. FORUMS) 

1. A public meeting place for 
open discussion. 2. A medium (as 
a newspaper) of open discussion 
or expression of ideas. 3. A pub¬ 
lic meeting or lecture involving 
audience discussion. 4. A program 
involving discussion of a problem 
by several authorities. 



^ Forum 

lb,. I In I"— 



*eForum designed by Marcus Watts, Copyright 1984, 
Network Technologies International, Inc. (NETI). 


Electronic meetings continue the 
automation of knowledge transfer which 
started with electronic mail. 

Electronic meetings are an extension of the 
communications revolution which started with 
electronic mail. It takes seconds to send a letter 
using electronic mail instead of days via 
regular mail. Certainly e-mail is a giant 
step in automating correspondence between 
two people. 

eForum goes yet further to provide 
immediate communications automation. But 
for groups. It creates electronic meetings 
which allow attendees to participate in 
discussions using the dynamic ebb and flow 
of points, counterpoints, comments and 


conclusions just like in-person meetings. 

From an economics point-of-view, eForum 
is the most cost effective method for bringing 
together the best minds in your company to 
meet on key issues—without the price of a 
single plane trip, the aggravation of schedule 
conflict or time-consuming delay. 
eForum is a communications breakthrough 
product. 

eForum lets you create electronic meetings 
with attendee lists as large as the company staff 
or as small as a three-person design team. 

Not only can eForum handle hundreds of 
meetings for your company, but, at the same 
time, limits each participant to only attending 
meetings to which he belongs. 




eForum, n. 1. Low cost electronic 
meeting system (as in needing no 
scheduling or travel to attend), v. 

1. Automatically organizes, indexes, 
files and leaves a complete written 
record of entire meeting. 2. Allows 
adding more attendees than normal 
at no extra cost. 3. Gives plenty of 
time to think before responding, 
adj. 1. Keeps everyone up-to-date. 

2. Doesn’t let geographic or time 
zones determine who can attend 
the meeting. 




The Electronic Meeting Manager 


If you have ever attended a meeting, 
you know how to use eForum. 


Simply attend eForum meetings any time 
convenient for you. Review new discussion 
materials. eForum keeps track of what you’ve 
seen. Enter your comments or new discussion 
points. Instantaneously, your ideas are 
available to every member of your eForum 
group regardless of geographic location. 
That’s productivity. 
eForum has the flexibility to fit your 
communications needs. 

• eForum 4000 - a national communications 
network available with a local phone call 
from most locations. 

• eForum 2000 - UNIX™ based central host 
software for supermicro and minicomputers. 

• eForum WS - software for the IBM PC and 


compatibles to interact with eForum central 
host software. 

Call 1-800-638-4832 or in Michigan call 

(313) 994-4030 collect for information on: 

• Automating your company’s meetings by 
using General Electric Information Service, 
the world’s largest communications network, 
to tie together your microcomputers and 
terminals. 

• Creating your own meeting network for your 
department or company. Software, hardware 
and leasing available. 

• Establishing OEM and VAR agreements 
to enhance the value of your software or 
hardware, with the communications power 
of eForum. Circle No. 46 on Inquiry Card 



Network Technologies 
International, Inc. 

The Arbor Atrium Building 
315 West Huron 
Ann Arbor, Michigan 48103 

'“UNIX is a trademark of AT&T Bell Laboratories 

' 'eForum is a trademark of Network Technologies International, Inc. 

(NETI) 
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(made by ICL), in that CAFS uses 
specially designed multiple-read¬ 
out disks [9]. A multiple-readout 
disk transfers data from all its 
disk heads at once and therefore 
sustains a much larger transfer 
rate than a conventional disk, 
which transfers data from one 
head at a time. CAFS has used the 
design to enable the fast scans of 
databases that are needed in 
applications where there are no 
useful indices. The absence of 
useful indices means you must 
read the whole file to get an 
answer to a query. The CAFS 
system speeds up that process 
because the disks themselves are 
functionally differentiated. 

This is not necessarily desir¬ 
able in all instances, however. In 
the case of Britton Lee, it was 
decided not to develop and manu¬ 
facture such disks because it was 
thought that the performance 
benefits of increased bandwidth 
would not justify the added cost 
and complexity. Decisions such 
as this riddle system design. In¬ 
deed, one of the harder parts of 
designing a dedicated system is 
deciding where to use general- 
purpose systems, and where to 
build specially tailored ones. In 
this particular case, it was decid¬ 
ed that the performance advan¬ 
tage of multiple-readout disks 
was significant only in those 
situations where users have no 
idea of what they should use as 
indices—an occurrence that’s 
rare in practice. 

IS IT WORTH IT? 

Making a functionally differen¬ 
tiated system is not easy. Special- 
purpose hardware and software 
must be developed and main¬ 
tained, and a means of working 
with general-purpose hardware 
and software must be devised. 
The question arises as to whether 
it’s all really necessary. If, for 
instance, you are having re¬ 
sponse time difficulties running a 


suite of matrix manipulation pro¬ 
grams, why not just buy a faster 
processor? If you want to increase 
the number of people using a 
database management system, 
why not just buy a faster general- 
purpose machine? The reason is 
simple. The cost/performance 
ratio is much better for function¬ 
ally differentiated systems: fast 
array processors are cheaper 
than equally fast general-purpose 


One of the harder parts 
of designing a 
dedicated system is 
deciding where to use 
general-purpose 
systems, and where to 
build specially tailored 
ones. 


machines. Likewise, fast data¬ 
base machines are cheaper than 
equally fast general-purpose com¬ 
puter systems packaged with 
DBMS software. It’s easy to un¬ 
derstand the cost/performance 
differences simply by doing a 
parts count: general-purpose sys¬ 
tems require more parts (compil¬ 
ers, interface requirements, com¬ 
patibility requirements, as well as 
a variety of devices). General- 
purpose systems also cost more to 
build and maintain than single¬ 
purpose systems—costs that you 
can be sure are passed along to 
end users. 

There are several factors that 
cause people to want to use 
backend database machines, as 
detailed in “Why Database Ma¬ 
chines?’’ [10]. The opportunity 


for heterogenous hosts to share a 
centralized database, and the 
ability of a database machine to 
act as a network server are 
among the most important fac¬ 
tors. These could be accom¬ 
plished without the use of a 
functionally differentiated sys¬ 
tem; a distributed database sys¬ 
tem would work equally well. But 
two other factors—reduced cost 
and faster database access— 
are powerful arguments favoring 
dedicated database machines. 

CONCLUSION 

If we take array processors as 
an example, we can see how the 
product evolution of database ma¬ 
chines is likely to proceed. Array 
processors first surfaced after a 
period of academic gestation. 
Ideas then became reality as a few 
small companies began manufac¬ 
turing array processors. 

The next logical step would be 
for major computer vendors to 
start up production. But this 
has not been true for either array 
processors or database machines. 
Why? Pretend you are a general- 
purpose computer maker. My 
speculation is that it does not 
make sense for you to offer low- 
cost alternatives to the high- 
priced high-end systems you 
already offer. To commit to manu¬ 
facturing array processors or da¬ 
tabase machines, you’d first have 
to overcome three objections. 
First, why dampen sales of the 
general-purpose system you’re al¬ 
ready offering? Second, why build 
competing product lines when 
you don’t have to? And, third, 
why take on additional develop¬ 
ment time? It takes a long time 
and a sizable commitment of 
resources to get a new product to 
market; you must decide if you are 
willing to make that commitment 
to a general-purpose product with 
a potentially wide customer base, 
or to a special-purpose machine 
with a necessarily narrower cus- 
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tomer base. If you already make 
general-purpose machines, the 
decision must be obvious. For this 
reason, backend machines re¬ 
main the domain of small, spe¬ 
cialized companies. 

It has taken a while for array 
processor companies to earn cus¬ 
tomer confidence in hardware not 
built by large general-purpose 
computer houses, but the cost- 
performance benefits offered by 
backend array processors, to¬ 
gether with good customer exper¬ 
iences with the companies that 
manufacture them has led to the 
the presence of several strong 
manufacturers and a relatively 
large number of array processors. 
The same pattern is emerging in 
the backend database machine 
realm. 

With a concurrent prolifera¬ 
tion of fast, cheap processors and 
fast communication networks, an 
increasing number of proposals, 
designs, and products have come 
to include functionally differen¬ 
tiated multiprocessor systems. 
Some of these, such as graph¬ 
ics display devices, operate as 
specialized front-end machines. 
Some, such as signal processing 
machines, act as semi-stand¬ 
alone systems. And some, such as 
array processors and database 
machines, act as backend sys¬ 
tems. The benefits of creating 
specialized devices for handling 
complex functions have become 
apparent. Further advances in 
the use of dedicated machines 
surely won’t be long in coming. 
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A. here is no trick to distin¬ 
guishing transaction processing 
from the sort of data processing 
typically found on UNIX systems. 
The two differ in both their trans¬ 
action administration needs and 
their database processing needs, 
though the latter requirements 
are much better understood in 
the UNIX system community 
than the former. 

Databases used in transaction 
processing, of course, must pay 
more attention to performance, 
concurrency, integrity, and ato¬ 
micity issues than do databases 
used for other purposes. Happily, 
a number of commercial DBMS 
products based on the UNIX 
system already have begun to 
solve some of these classic prob¬ 
lems. On another front, though, 
the transaction administration 
issues of control and tunability 
have been wholly unaddressed by 
the body of generally available 
database packages. What’s more, 
little attention has been paid to 
either process architecture effi¬ 
ciencies or performance-oriented 
forms handling. 

This should not be taken to 
suggest, however, that transac¬ 
tion processing within the UNIX 
system operating environment is 
impossible. In fact, transaction 
processing has been available for 
years inside AT&T, using inter¬ 
nally developed UNIX system 
tools. One such application runs 
300 bisynchronous terminals en¬ 
tering a total of 3500 transac¬ 
tions per hour, all on a DEC PDP 
11/70. With the availability of 
shared memory and interprocess 
communication in System V, 
transaction processing has be¬ 
come a particularly realizable 
goal since no kernel modifica¬ 
tions beyond the addition of com¬ 
munication device drivers are 
necessary. 



B 
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OMPUTING 


TRANSACTION PROCESSING 
VERSUS TIMESHARING 

Transaction processing appli¬ 
cations tend to be closely tied 
to activities that generate rev¬ 
enue. They can perform much of 


Use of the standard UNIX system kernel for 
transaction processing 

by Kathryn J. Anderson 
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action systems using fourth-gen¬ 
eration languages. The applica¬ 
tions are much too complex and 
performance is altogether too 
crucial. 

To do successful transaction 
processing requires that a system 
administrator be able to both 
control and tune the running 
application. “Control” requires 
that a mechanism exist for giving 
preference to important jobs, 
while the ability to “tune” allows 
the system administrator to dy¬ 
namically change the priority of 
jobs as needs change from day to 
day and hour to hour. 

Attention to control and tuning 
provides a useful basis for con¬ 
trasting transaction processing 
with the UNIX system’s tradition¬ 
al timesharing mindset. 

First, a timesharing workload 
is far less predictable than 
a transaction processing work¬ 
load. A timesharing administra¬ 
tor might know that 10 to 11 am 








is often scheduled to run during 
off-peak intervals. 

Because transaction process¬ 
ing is used so heavily in revenue- 
related activities, it generally is 
less cost-sensitive than other ap¬ 
plications. Companies are typi¬ 
cally willing to spend more both 
for operating expenses (in the 
form of well-paid system adminis¬ 
tration staffs) and development 
and maintenance expenditures 
(the price paid for data processing 
expertise) when they can see a 
direct link to revenue generation. 
The willingness to spend is 
buoyed by the fact that there 
aren’t many options. For the 
present at least, end users simply 
cannot develop their own trans¬ 


the drudge work necessary 
maintain accounts receivable, 
for instance. In the course of 
these activities, large, complex 
databases are acted on by a high 
volume of transactions. Most 
of these transactions are of 
the “bread-and-butter” variety, 
meaning that they involve only 
short interactions with the data¬ 
base (less than 10 reads and/or 
writes), arrive in predictable pat¬ 
terns, and are usually manipulat¬ 
ed by clerical people stationed at 
CRTs. Some of the data used in 
transaction processing must be 
available to many users simulta¬ 
neously, both for reading and 
updating. 

Transaction processing also 
involves some batch work, in¬ 
cluding both report and update 
programs. Processing of this sort 


Illustration by Victor von Heck 
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is the busiest part of an average 
day, but this offers no insight into 
the sorts of jobs that will be run 
during any one particular period. 
Without a clear notion of work¬ 
load or schedule, the system ad¬ 
ministrator cannot choose in ad¬ 
vance the jobs that should have 
highest priority. Control is thus 
lost. 

Secondly, even if the more 
important jobs could be identi¬ 
fied, the UNIX system offers no 
mechanism beyond superuser 
nice intervention for favoring 
them. Using nice alone is not 
sufficient. For instance, if a long 
running job, say the formatting of 
a large document using nroff, 
was temporarily of utmost impor¬ 
tance, nice could not by itself 
ensure that it would keep run¬ 
ning. That’s because the system 
offers no systematic capability for 
tuning. 

Unfortunately, the natural bias 
of UNIX system users toward 
timesharing and away from con¬ 
trol and tuning is apparent in the 
commercial DBMS products cur¬ 
rently available. 

TRANSACTION PROCESSING 
WITHIN AT&T 

One of the most widely de¬ 
ployed UNIX-based transaction 
processing systems is known 
as the Automated Repair Service 
Bureau (Bell System Techni¬ 
cal Journal , July-August 1982). 
ARSB was developed by Bell Lab¬ 
oratories to verify, record, and 
track customer phone service 
troubles for the Regional Bell 
Operating Companies. It is a dis¬ 
tributed system capable of run¬ 
ning 3500 transactions per hour 
on each of several PDP 11/70s 
connected via a local area net¬ 
work. The first UNIX system 
version of the ARSB was deployed 
in 1980, and today over 200 
ARSB installations are active 
across the country. Originally de¬ 
veloped on UNIX Release 4.0 with 


an early version of shared mem¬ 
ory and interprocess communica¬ 
tion, ARSB was recently ported to 
standard System V. 

The ARSB is just one of several 
dozen System V-based transac¬ 
tional application developments 
that have been developed at AT&T 
for internal use and external sale. 
Many of these systems use tools 
that evolved during the ARSB 
development. 

TRANSACTION 
ADMINISTRATION AND 
PROCESS ARCHITECTURE 

Key to transaction processing 
efficiency in the UNIX operating 
system is process architecture. 
Topical commercial DBMS prod¬ 
ucts use two processes per termi¬ 
nal, one for the terminal handler 
and one for database processing. 
These processes are often con¬ 
nected via a pipe. As might 
be guessed, processes multiply 
quickly under this approach. 
Had ARSB employed a similar 
scheme, for instance, it would 
have had to handle well over 
1000 concurrent processes (quite 
impossible on an 11/701). 


For transaction processing, the 
process-per-terminal approach is 
not much better than one once 
taken in AT&T by a group of 
System V rookies. These pro¬ 
grammers, well experienced in 
IBM’s IMS, were told to build their 
first UNIX system-based transac¬ 
tional application. In their initial 
design, the process-per-terminal 
that handled user input would 
fork/exec each different applica¬ 
tion as needed. The exec’d load 
module contained both applica¬ 
tion code and DBMS code. (See 
Figure 1 for a representation.) 
Fortunately, the system prototype 
only had to handle 12 users who 
each entered a maximum of 30 
transactions per hour. More for¬ 
tunate yet was the fact that an 
expert from ARSB was called in 
before the system actually went 
into production. 

The process architecture inef¬ 
ficiencies were obvious to the 
expert. First, the overhead of the 
fork/exec cost the system about 
200 milliseconds per transaction 
and often led to additional disk 
accesses. Second, the application 
process opened the database with 
each transaction, adding an extra 



Figure 1 
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250 milliseconds per instance. 
Third, memory was used ineffi¬ 
ciently since each transaction 
was encapsulated in a different 
load module. This was the most 
subtle of the problems. If five 
different terminals were involved 
in five different transactions, the 
application processes could not 
share text—a particularly waste¬ 
ful problem since 80 percent of 
the code, the DBMS part, was 
the same from application to 
application. 

In both the rookies’ approach 
and the typical commercial DBMS 
approach, there is no central 
point through which transaction 
requests can be funneled. As a 
result, there is no means of 
transaction administration—no 
way to control or tune transaction 
execution. Suppose 10 out of 50 
users were to start up long- 
running database scans. Since 
the typical DBMS product uses an 
application process per terminal, 
the requests can’t be throttled by 
single-threading them. Even if 
they could, how would the system 
administrator be able to see to it 
that some long-running transac¬ 
tions enjoyed a high priority? For 


example, a bill printing transac¬ 
tion is almost always more impor¬ 
tant than the printing of a mar¬ 
keting summary report, but it 
would be extremely difficult to 
tune for temporary favoritism in 
process-per-terminal schemes. 

The ARSB system allows for 
transaction administration by 
taking a “bucket brigade” ap¬ 
proach to process architecture. A 
set of about 100 cooperating pro¬ 
cesses, known as the Group, is 
started at system boot time by a 
module known as the Group Man¬ 
ager. Each process runs contin¬ 
ually as a daemon. Input is first 
processed by the terminal han¬ 
dler, which is a single user-level 
process capable of handling all 
300 terminals. The input is then 
passed via System V messages to 
a parser/validator, where it in 
turn is passed to an appropriate 
application process running back 
and forth to one of two identical 
DBMS processes. One transac¬ 
tion visits an average of five 
processes, while three or four 
transactions reside together at 
any one time in a “brigade”. (See 
Figure 2 for a representation.) 

This implementation removes 


the process architecture ineffi- 

ciencies of more conventional 
DBMS approaches. Daemons that 
are running continually cause 
only one fork/exec to be executed 
per day, just as database pro¬ 
cesses only need to open appro¬ 
priate databases once a day. Du¬ 
plicate modules, in the meantime, 
can share text. Control and ad¬ 
ministration, however, are im¬ 
plicitly achieved. The Group is 
pre-tuned for the ARSB applica¬ 
tion and hardware configuration, 
both in terms of the means by 
which transactions proceed from 
process to process, and the num¬ 
ber and type of processes that are 
started. 

All of these advantages also 
can be achieved using other ap¬ 
proaches. In the case of the 
rookies’ application, a “name 
server” strategy was used to solve 
their process architecture prob¬ 
lem. As in the ARSB, System V 
messages were used, and there 
were far fewer processes than 
users. However, the architecture 
was not implicit in a Group 
Manager—instead, a requesting 
process used a name server to see 
where it needed to go to get the 
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next step done. This solution 
extended the ARSB approach, 
allowing for more flexible control 
and more direct tuning, and thus 
now is evolving into a standard 
transaction processing architec¬ 
ture within AT&T. 

The name server does provide 
a means of control by allowing 
administrators to de-queue re¬ 
quests in priority order. The in¬ 
formation determining priority 
might include expected runtime, 
relative importance of transac¬ 
tions to business needs, and 
percentage of original request 
completed by previously-visited 
processes. Specific long-running 
database scans can be throttled 
by having only one daemon, re¬ 
presented by a single entry in the 
routing table. The most impor¬ 
tant long runners may have two 
or more entries. 

The name server also provides 
tuning capabilities. Transaction 
priority information can change 
over time; for example, queries 
could be more critical than up¬ 
dates most of the time, but be far 
less important in rare instances, 
such as at the end of the month. 
The system administrator can 
respond to this by adding and 
deleting daemons on the name 
server’s tables. 

Further, name servers provide 
for the real-world occurrence of 
system overload. By making note 
of a systemwide upper threshold 
of work, it can tell a requesting 
process, “System busy, please try 
again later.” 

A degree of recoverability 
can also be provided this way. 
In at least one case I’m aware 
of, redundant daemons were in¬ 
stalled in the trial site for a new 
application. When one crashed, 
the other picked up all incoming 
work without delay. 

The name server concept has 
been implemented by various 
AT&T projects as device drivers or 
as user-level code. Some applica- 


Systems providing 
quality transaction 
processing service 
have been built on top 
of standard System V 
without any kernel 
modifications 
whatsoever. 


tions use a binary request model, 
where responses are always sent 
directly back to originating pro¬ 
cesses. Some other applications, 
though, use table-driven process 
routing. Results from one server 
are routed dynamically to either 
the next appropriate server or 
back to the original client. 

INPUT HANDLERS 

Experience shows that trans¬ 
action processing involves an 
entire family of input handlers. 
Clerks use fixed forms to enter 
bread-and-butter transactions 
every minute or so from CRTs. 
System administrators and more 
sophisticated end users interact 
with the applications on a more 
ad hoc basis. And, there’s often a 
non-CRT form of input that must 
also be read and processed. 

Terminal handlers designed 
for clerical users generally do not 
bother with pop-up windows, full¬ 
screen friendly error messages, or 
fancy help. The object, after all, is 
efficiency, and these users are 
so well trained for specific inter¬ 
actions that fancier features 
are superfluous. Handlers de¬ 
signed with clerical users in mind 
thus should refrain from expen¬ 


sive field-by-field validation com¬ 
munications with the DBMS 
since errors are not common. To 
avoid the process-per-terminal 
syndrome, a software multiplexor 
is commonly preferred. 

For the system administrator 
or sophisticated end user, though, 
fancier forms are appropriate. 
Since clerks presumably out¬ 
number such users, efficiency is 
not as critical. Thus, for them, 
a process-per-terminal architec¬ 
ture is often acceptable. The 
needs of the sophisticated user 
are less predictable than those of 
the clerk, so multiple windows, 
mouse controls, and elaborate 
help also become important. 

Even these non-terminal input 
handlers and fancy form han¬ 
dlers must be easily integrated 
into the process architecture, 
though. Despite their small num¬ 
bers, sophisticated users could 
generate as much transaction 
load as clerks requesting monthly 
summaries. The controls that 
apply to bread-and-butter trans¬ 
actions thus must also rule ad hoc 
work. 

A caching mechanism for form 
definitions is also valuable if one 
is to avoid going to disk for each 
form request. One terminal han¬ 
dling system I’m aware of pro¬ 
vides a shared-memory buffer 
that can be pre-loaded with popu¬ 
lar form definitions. The segment 
of the buffer not taken by pre- 
loaded forms is managed by the 
system in a least-recently-used 
fashion. 

A FEW DATABASE ISSUES 

Among the absolute necessi¬ 
ties for DBMS applications used 
in transaction processing are 
high performance, atomic trans¬ 
actions, recovery, and concur¬ 
rency control. These are topics 
that are covered extensively in 
other literature, including several 
of the other articles in this maga¬ 
zine, so I won’t belabor them 
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here. But experience suggests 
that there are also other needs 
that are not quite so obvious. 

The various files involved in a 
complex database are not homo¬ 
geneous. Thus, it is necessary 
to vary many parameters file- 
by-file, including block size, 
hash function, logging strategy, 
and buffer-pool management al¬ 
gorithms. For example, one net¬ 
work-model DBMS developed and 
used widely in AT&T provides 
several buffering strategies, se¬ 
lectable per file at database gen¬ 
eration time, including: 

• a memory-only (i.e. non-disk) 
strategy used for temporary 
data needed only during the life 
of a single transaction. 

• a permanent in-memory stra¬ 
tegy useful for small files that 
are frequently accessed. 

• a least-recently-used strategy 
for traditional files. 

Further, the performance that 
characterizes complex databases 
makes a two-tiered interface to 
the DBMS necessary for transac¬ 
tion processing. One tier should 
offer a navigational, record-at-a- 
time interface while the other 
should provide a non-procedural 
interface. 

The navigational interface, im¬ 
plemented via a programming 
language, is the one most appro¬ 
priate for bread-and-butter trans¬ 
actions. Granted, it takes a week 
or two for a programmer using 
such an interface to develop a 
transaction that does inserts, 
modifies, retrieves, and deletes on 
a single record, while the same 
transaction could be developed in 
just a few hours using one of the 
forms-oriented application gener¬ 
ators provided with many com¬ 
mercial DBMS products. But 
trades of bread-and-butter trans¬ 
action runtime efficiency for re¬ 
duced development time are not 
cost-effective. Since these trans¬ 
actions can be executed thou¬ 


sands of times an hour, the 
overhead of interrogating the 
data dictionary at the time of each 
request accumulates quickly. 

Application generators and re¬ 
lated ad hoc query systems are 
useful in transaction processing, 
though. End user report develop¬ 
ment and batch program develop¬ 
ment in the DP shop would be 
quite difficult without them. For 
reports that are only generated 
infrequently, it makes no sense to 
trade development time for run¬ 
time speed. Application genera¬ 
tors and ad hoc query systems are 
also useful in rapid prototyping. 

CONCLUSIONS 

To do transaction processing 
requires two things: a bias toward 
performance in data manage¬ 
ment and forms handling, and a 
process architecture amenable to 
system administration. 

There is much discussion now 
in the UNIX system community 
about the need to modify the 
kernel for transaction process¬ 
ing. File system and scheduler 
enhancements are themes that 
are especially hot. There’s no 
question but that these features 
would be useful. But, while we 
wait for them to surface, note that 
systems providing quality trans¬ 
action processing service have 
been built on top of standard 
System V without any kernel 
modifications whatsoever. 


Kathy Anderson was a System V 
rookie, well experienced in IBM's 
IMS, when she was asked to write 
her first UNIX system-based appli¬ 
cation. She has been with AT&T 
Bell Laboratories for 12 years, the 
last two of which have been spent 
supervising the development and 
support of a System V transaction 
monitor and database management 
system for use within AT&T She 
currently supervises the Transac¬ 
tion Systems Group at Bell Labs.M 
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RULES 

OF THE GAME 

What's in a name? 

by Glenn Groenewold 


When I was a child, my father 
worked for an oil company. One 
day he happened to mention that 
since his employer had no refin¬ 
ery in our area, the gasoline we 
were buying really came from a 
competitor, though it was mar¬ 
keted under his company’s name. 

I was astonished by this revela¬ 
tion. Hadn’t I been hearing and 
seeing advertising in which his 
company’s product was pro¬ 
nounced superior to all compet¬ 
ing ones? 

Patiently, my father explained 
that gasoline, basically, is gaso¬ 
line, and that the equivalent 
products from other major com¬ 
panies were quite interchange¬ 
able. They were differentiated 
chiefly by what 1 have now 
learned to recognize as such 
things as brand names, logos, 
and slogans. These devices are at 
the core of modern marketing. 

Even after we learn this truth, 
many of us continue to permit our 
choices to be governed to some 
extent by marketing factors irrel¬ 
evant to the product itself. We do 
this each time we select the can 
with the familiar label from 
among the stacks of canned to¬ 
matoes in the supermarket. Per¬ 
haps it’s habit, or maybe we 
resort to familiar products simply 
because it saves us time and 
mental effort. 

But what has this to do with 



computing? Plenty, some people 
say. They point out that the 
typical software user usually 
doesn’t care how the programs 
competing for his or her attention 
achieve their ends. Oh, a sophisti¬ 
cated few may take the trouble to 
investigate such things as how 
much memory each program re¬ 
quires, but these discriminating 
consumers are a distinct minor¬ 
ity. Most users, it’s claimed, will 
base their selection on the same 
sorts of factors that determine 
their choices of gasoline or 
canned tomatoes. 

If this view is correct, product 
superiority shrinks in importance 
as a determinant of success in 
the software marketplace, while 
such items as trade names and 
logos correspondingly loom larg¬ 
er. Let’s take a look at how the law 
views these marketing devices. 


Federal and state laws protect 
several categories of proprietary 
marketing aids, and some of 
these overlap. From our stand¬ 
point, one of the most important 
is trademarks. 

A trademark relates to goods , 
and computer software is consid¬ 
ered such. Broadly speaking, a 
trademark is something that a 
manufacturer or merchant uses 
to distinguish his or her goods 
from those of others. This can be 
a name or words, or it can be 
something like a symbol or pic¬ 
ture. Logos and corporate sym¬ 
bols therefore can constitute 
trademarks. 

An important thing to remem¬ 
ber about trademarks is that they 
cannot exist independently of a 
product. This means you can’t set 
aside a trademark for use with 
some product you intend to devel¬ 
op in the future. The trademark 
must be placed on the goods 
themselves, or on their contain¬ 
ers or the “displays connected 
with them’’, or on labels affixed 
to them. You can’t establish a 
trademark just by using it in 
advertising, on letterheads, or on 
invoices, for example. 

Another category of names, 
words, symbols, and the like that 
can be protected under the law is 
service marks. These are almost 
exactly like trademarks, except 
that they relate to services in- 
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stead of goods. In the computing 
field, service marks could be 
important to someone contract¬ 
ing to perform systems manage¬ 
ment, for example. 

The third category of interest is 
trade names. This simply has to 
do with the names under which 
people or companies do business. 
Trade names do not necessarily 
attach to particular goods or ser¬ 
vices, although they may. Some¬ 
times a company’s trade name 
and its trademark are the same; 
two obvious examples are Exxon 
and Xerox. 

A MIXED BAG 

The laws governing protection 
of trademarks, service marks, 
and trade names are based on 
common law principles. (Com¬ 


mon law was discussed in this 
column last December. In brief, it 
encompasses the body of “judge 
made’’ legal rules that originated 
in England several centuries ago.) 
Since the laws passed by Con¬ 
gress don’t cover everything in 
this area, it is often necessary to 
look to the laws of the various 
states—most of them have some 
sort of trademark legislation—or 
even to common law itself for the 
answers to questions that arise. 

Federal law, and the laws of 
many states as well, provides for 
registration of trademarks and 
service marks under specified 
circumstances. However, federal 
statutes do not provide for regis¬ 
tration of trade names. 

Registration of a trademark or 
service mark is advantageous be¬ 


cause it creates a legal presump¬ 
tion that the mark is valid. This 
can be quite useful if it becomes 
necessary to go to court to stop 
someone from infringing your 
mark. 

Under federal law, a trademark 
must take on what is known as a 
secondary meaning in order to 
be registrable. It’s difficult to 
define this term; among the 
trademarks that have successful¬ 
ly passed the test are Chap Stick 
and Holiday Inn. A trademark 
can’t merely be descriptive of the 
product, nor can it be recogniz¬ 
able only as the surname of the 
manufacturer or seller. (However, 
an individual’s name can be 
registered as a service mark, 
where it has become associated 
by the public with a particular 
type of service; one example 
would be entertainer Johnny 
Carson.) 

Once a trademark or service 
mark has been accepted for regis¬ 
tration under the federal statute, 
the owner of the mark is entitled 
to use the symbol ® in conjunc¬ 
tion with it. It’s against the law to 
make use of this symbol in any 
other situation. 

A number of well-known trade¬ 
marks and service marks have 
not been registered, sometimes 
for esoteric legal reasons. In an 
attempt to protect these unregis¬ 
tered marks, the informal prac¬ 
tice of using the letters “TM” to 
identify a trademark and “SM” 
to identify a service mark has 
evolved. Though these abbrevia¬ 
tions have no official standing, 
they do serve notice on persons 
who see the marks that an owner¬ 
ship right is being asserted. 

UNIX is among the trademarks 
that have not been registered. 
Since it’s particularly important 
that the owner of an unregistered 
mark be vigilant against the ap¬ 
propriation of the mark by others, 
it’s not surprising that AT&T’s 
lawyers have been quite busy 


CINIX/XENIX Communications 
Available NOW! 


Put your 
computers on 
speaking terms. 



*295°° 


>*s***'~f"' 

TERM. Communications Software 

Everyone from the beginning computer user to the expert finds TERM easy to learn and Powerful!o use .Just 
plug it in and go! In a few keystrokes you can access a remote database or send a group of files to another 
system. 

TERM allows your computer to perform efficient, error-free exchange of binary or text files, over phone lines 
or hard-wired circuits at speeds of up to 9600 baud. Available options allow you to include or exclude a group 
of files for transfer in a single command. 


of sessions with remote mainframe and minicompu- 

• Modem7 protocol for remote bulletin boards 

• Auto-dial/Answer and Hangup supported on 
Hayes Smartmodem 300/1200 and compatibles 

• Programmable batch file capability 

• Unattended file transfer/auto logon 

• Translation tables for input and output 

• Remote maintenance capability 

Term is available NOW on the Altos 586. IBM AT. Tandy Model 16, AT&T 362 and IBM PC/XT. MSDOS and many 
others Find out how easy it is to get your UNIX. Xenix, and MSDOS machines all talking together 


TERM’S "data capture" feature allows saving transcripts 
ters to disk for later editing or printout, if desired 

• Pre-installed and ready to run 

• Automatic error checking and re-transmission 

• Wildcard (V*) file send/receive, capability 

• Xon/Xoff. Etx/Ack. Ascii protocols for com¬ 
munications with non-TERM systems 

• Full/half duplex emulation mode for remote 
systems 




CENTURY 

SOFTWARE 

We make It easy tor you. 


9558 South Pinedale Circle 
Sandy, Utah 84092 
(801) 943-8386 


Circle No. 21 on Inquiry Card 


68 UNIX REVIEW AUGUST 1985 








writing letters telling people to 
watch their use of the UNIX 
trademark. The reason that the 
company’s licensing agreements 
contain specific restrictions on 
licensees’ uses of the trademark 
should also be apparent. 

SELECTING—AND 
PROTECTING—A NAME 

Registration will be denied any 
trademark or service mark that 
appears to be in use by someone 
else. So it’s important that this be 
determined before you launch 
your product or service. There are 
search companies that will pro¬ 
vide this investigation—for a fee, 
of course—and there are legal 
publications that contain alpha¬ 
betical lists of trademarks that 
have been registered or have been 
upheld in court decisions. It’s a 
good idea to consult your local law 
librarian for particulars. 

Once you have selected a mark 
and have begun to use it, your 
work has only begun. It’s vital 
that you take action any time you 
get wind of someone else using 
the same name or symbol to 
identify another product or ser¬ 
vice, or whenever you hear of 
someone else using your mark 
without indicating so. In either 
case, you should write a letter 
telling the offenders to “cease 
and desist’’ whatever it is they’re 
doing. And if they don’t? Well, 
then it’s time to see your lawyer. 

THE GENERIC PITFALL 

In American law, there often 
are penalties for succeeding too 
well, and this area is no excep¬ 
tion. If you promote your mark 
so efficiently that it becomes a 
generic identification of your par¬ 
ticular type of good or service, you 
stand to lose it. 

There have been some cele¬ 
brated debacles of this sort—one 
of the earliest casualties was 
aspirin —and there also have 
been a number of close calls. 


Kodak and Coca-Cola (for “Coke”) 
have had scares at one time or 
another, and many of us will 
remember Xerox Corporation’s 
institutional advertisements re¬ 
minding us that its trademark is 
not a verb. Just recently the 
makers of Toll House cookies 
have seen their trademark pass 
into the public domain as just 
another part of the language. 

Perhaps what will serve to save 
some creators of software from 
becoming the victims of their own 
success in promoting trademarks 
is simply the ephemeral nature of 
the product. By the time a pro¬ 
gram has become well-enough 
known to the public that its 
trademark is in danger of becom¬ 
ing a generic term, it may well 


have been rendered obsolete. 

Short-lived or not, trademarks 
and service marks appear des¬ 
tined to be a major factor in 
determining marketplace success 
in the computer industry. While 
they do not in any way protect the 
product itself, they may be deci¬ 
sive in dictating how many people 
purchase it. That just might make 
the marks more important than 
all the copyrights, patents, and 
trade secrets put together. 

Glenn Groenewold is a California 
attorney who devotes his time to 
computer law. He has served as an 
administrative law judge, has been 
active in trial and appellate work, 
and has argued cases before the 
state Supreme Court. ■ 
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Why the RISC route? 

by Mark G. Sobell 


“RISC” is a term one hears 
often in conversations these 
days. The reason is simple: RISC 
describes one of the most exciting 
directions in current technology. 
The acronym stands for Reduced 
Instruction Set Computer. An in¬ 
struction set in a RISC architec¬ 
ture may consist of 50 or so 
simple instructions that can play 
the same role previously assigned 
to hundreds of complex instruc¬ 
tions. Most computers available 
today, whether mainframe or 
micro, rely on the larger instruc¬ 
tion sets. Because the RISC in¬ 
struction set is simpler, a ma¬ 
chine built around it can run 
significantly faster than compa¬ 
rable units using conventional 
architectures. 

One of the design goals for a 
RISC is to reduce the number of 
side effects instructions generally 
have. No longer is it necessary to 
have auto-incrementing, fancy 
type addressing, or instructions 
that “move a character string 
terminated by a null byte from 
here to there”. Speed of execution 
has been increased as a result 
because simpler instructions run 
faster. Simpler instructions also 
mean cleaner exception process¬ 
ing: there is less to deal with 
when an interrupt occurs. The 
RISC approach appears likely to 
avoid one of the major problems 
confronting chip manufacturers 
today: the chore of getting excep¬ 



tion processing on new, complex 
chips to work correctly in all 
cases. 

Another feature most RISCs 
offer is the ability to clean micro¬ 
code off processor chips. Micro¬ 
code is CPU chip software that 
breaks down complex machine 
instructions into a series of 
simpler instructions executable 
by the chip. Under RISC archi¬ 
tecture, each instruction is sim¬ 
ple enough to go directly to 
the chip. To compensate, the 
smarts that were previously 
embedded as microcode can now 
be moved to the software (compi- 
ler/assembler) level. Because it is 
at a lower level, RISC object code 
tends to be about 150 percent of 
the size of conventional code, but 
this bulk is more than compen¬ 
sated for by increased processor 
throughput. 

RISC and VLSI (Very Large 


Scale Integration, a technology 
used in constructing micropro¬ 
cessors) seem like technologies 
meant for each other. The most 
complex, expensive, and time- 
consuming part of VLSI chip 
design is the layout and debug¬ 
ging of chip logic (the part of the 
chip that executes instructions). 
The easiest part is the design of 
the data area of the chip since 
data areas are regular and can be 
replicated many times. A RISC 
implemented in VLSI technology 
offers the advantage of minimiz¬ 
ing chip logic work while empha¬ 
sizing data area design. 

DIFFERENT STYLES OF RISC 

RISC is a generic term that 
people have attached many dif¬ 
ferent meanings to. Many of the 
distinctions between different 
RISCs come from the information 
stored in the data portion of the 
VLSI chip. 

The IBM/Stanford style RISC 
uses the data portion of the VLSI 
chip for general-purpose regis¬ 
ters. An optimizing compiler 
must be used to take maximum 
advantage of these registers and 
increase processor throughput. 

The Berkeley style RISC uses 
the data portion of the chip for 
register windows. These windows 
essentially provide for fast access 
by keeping the top of the stack, 
or several different stacks, on 
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the chip. The RISC design 
employed by Pyramid Techno- 
logy Corp. uses register windows 
while simultaneously implement¬ 
ing instructions by way of 
microcode. 

It is also possible to use the 
data area of the VLSI chip for 
cache memory (high-speed, im¬ 
mediately available memory), but 
current technology does not allow 
you to store enough data on the 
chip to make this approach prac¬ 
tical. Finally, data areas also can 
be used for memory management 
functions. 

All of the big manufacturers 
are already at work on RISCs: 
HP’s next generation of ma¬ 
chines, codenamed “Spectrum”, 
will be RISCs. Rumor has it that 
DEC West is working on two 
RISC-based products. IBM is ex¬ 
pected to announce a RISC work¬ 
station later this year. And AT&T 
is reported to be working on a 
RISC project targeted at running 
UNIX efficiently. 

MIPS COMPUTER SYSTEMS 

The acronym mips stands 
for Millions of Instructions Per 
Second. It is a measure of 
raw computer power: how many 
instructions can a processor pro¬ 
cess in a single second? A VAX 
11 /780 is rated around 1.0 mips. 
A Motorola 68010 comes in at 
between 0.5 and 0.6 mips while a 
68020 can do 1.0 to 1.5 mips (or 
slightly faster in some cases). 

MIPS is also the name of a 
Silicon Valley startup that just 
received $9 million in venture 
financing to produce a RISC. 
According to John Mashey, one of 
the Bell Labs PWB/UNIX team 
members who now serves as 
Manager of Operating Systems at 
MIPS, boards using the first pass 
of the MIPS chip will run at 
between 3.0 and 5.0 mips, de¬ 
pending on memory type. That 
would assume that the propri¬ 
etary chip was running at 8 MHz. 


Mashey expects the second iter¬ 
ation of the chip to double that 
speed and nearly double the mips 
rating. Imagine, if you can, a 
single-board, desktop computer 
offering the power of eight VAX 
780s. 

Why else would venture cap¬ 
italists invest so heavily in a 


The smarts that were 
previously embedded 
as microcode can now 
be moved to the 
software (compiler/ 
assembler) level. 


company going up against IBM, 
AT&T, DEC, and HP? Probably 
because, in addition to the tech¬ 
nology, they looked at the tech¬ 
nical accomplishments of the 
players on MIPS’ side: John Hen- 
nessy, former project leader of 
the MIPS RISC effort at Stanford 
University that produced a RISC- 
based microprocessor that 
outperformed commercial micro¬ 
processors by a factor of five. 
John Moussouris, the former IBM 
liaison to Stanford and Manager 
of VLSI System Integration at 
IBM’s Thomas J. Watson Re¬ 
search Center, where he designed 
the logic for a very high perfor¬ 
mance 32-bit RISC-based VLSI 
processor. Edward Stritter, chief 
architect of the Motorola 68000. 
Todd Basche, architect of the 
Apollo DN660 workstation. Les 
Crudele, architect of Motorola’s 
68010 and 68020 processor fam¬ 
ily. And, of course, John Mashey, 
a major contributor to the Pro¬ 
grammer’s Workbench version of 
UNIX at Bell Labs. The list con¬ 


tinues, including more talent 
from IBM, Intel, Zilog, and Data 
General. 

OVERVIEW OF A RISC SYSTEM 

One of the issues facing anyone 
developing a RISC is the question 
of just what instructions to put on 
the chip. “When you minimize 
what is on the chip, you’d better 
make sure you’ve picked out the 
right stuff,” Mashey said. “We 
will, of course, run UNIX. We set 
up the chip with UNIX in mind, 
especially tuning the areas of 
memory management and excep¬ 
tion (interrupt) handling.” 

Aside from instruction set con¬ 
siderations, RISC compilers are 
critical to performance. “MIPS 
licensed the technology developed 
at the Stanford MIPS project and 
is using it as a base for further 
compiler and optimizer develop¬ 
ment,” Mashey explained. “We 
developed backends for optimiz¬ 
ation, code generation, and as¬ 
sembly. We developed front-ends 
for Fortran, Pascal, and C. An 
optimizing compiler is a good fit 
with MIPS-style RISC. It helps 
minimize code size, and only a 
good global optimizer can take 
advantage of the large number of 
registers.” 

The theoretical goal of a RISC 
is to execute one instruction 
per basic machine cycle. To help 
achieve this goal, the computer 
overlaps instruction execution 
in what is called an instruction 
pipeline. The computer is always 
starting work on new instruc¬ 
tions before completing previous 
ones. This technique works 
well—until the machine runs 
across an instruction that’s de¬ 
pendent on one that precedes it. 
There are several solutions to this 
problem, including hardware-im¬ 
plemented pipeline interlocks 
that can stall the computer until 
all the necessary instructions 
have run to completion. The fol¬ 
lowing source code and the result- 
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ing assembler code demonstrate 
the problem. The examples are 
not written in any particular 
language; they are simply concep¬ 
tual representations: 


they start execution. A simple 
software solution would be to 
have the assembler insert do- 
nothing instructions (rxops) be¬ 
fore each of the instructions 
marked with asterisks. Although 


moves it around so that it will run 
most of the time without the need 
for rtops. As an example, it might 
generate the following code based 
on the preceding example: 


Problem source code: 

this solution works, it is ineffi¬ 

Assembler code modified by 

A = B + C 


cient because it wastes the com¬ 

the MIPS Pipeline 

Reorganizer 

D = E 


puter’s time. 





Most computers use hardware 

load word 

register-! B 

Assembler code generated by problem: 

nops , or interlocks, to avoid the 

load word 

register-2, C 



software nops that would other¬ 

* load word 

register_3, E 

load word 

register..! B 

wise be required. MIPS, however, 

add 

register—! register_2 

load word 

register_2. C 

has come up with another, more 

store word 

register—! A 

♦ add 

register..! register_2 

efficient software solution called 

store word 

register_3, D 


store word register—! A 

load word register_3, E 

* store word register_3, D 

The instructions marked with 
asterisks depend on previous 
loads being complete by the time 


the Pipeline Reorganizer. (As a 
matter of fact, another thing 
MIPS stands for is “Microproces¬ 
sor without Interlocked Pipeline 
Stages”.) The Reorganizer takes 
the assembler code that compil¬ 
ers or programmers generate and 


In this example, the line with 
an asterisk is the one that the 
Reorganizer moved. By taking the 
instruction that loads register—3 
and putting it where it otherwise 
would have needed a nop, the 
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Reorganizer got rid of the need for 
two nops (the one it replaced with 
the load instruction, and the one 
that would have otherwise been 
required between between the 
load register—3 and store regis¬ 
ters instructions. 

The MIPS Reorganizer puts 
useful code in 80 to 90 percent of 
the delay slots that would other¬ 
wise be filled with nops. What’s 
more, 20 percent of the instruc¬ 
tions ultimately executed typical¬ 
ly will be placed in these delay 
slots. That means the Reorganiz¬ 
er provides 20 percent more 
throughput than schemes that 
insert nops in delay slots. 

This example only hints at how 
MIPS is moving some of the 
intelligence of its computer from 


the hardware/microcode arena to 
the software side of things. In 
addition to allowing the hardware 
to be much simpler, this approach 
also makes it much easier to 
modify software (since, as you 
might guess, microcode embed¬ 
ded on a microprocessor chip is 
not easy to modify). 

SUMMARY OF RISC 

Mashey summed up the major 
RISC issues by saying, “Complex¬ 
ity in a computer is like garbage. 
You can’t ignore it, but you can 
choose where to put it. 

“Prior to RISCs, the trend 
in microprocessor architecture 
was toward putting as much 
intelligence on the microproces¬ 
sor chip as possible, as in com- 


HUNIXH 

POWER 


100,000 software developers can’t be 
wrong* 

UNIX is the chosen operating system for 
more than 100,000 software developers 
because it has the power they need. But 
developers aren’t the only people who need 
computing power. Any business that wants 
multi-users to access the same files at the 
same time or wants to simultaneously run 
multi-task operations... needs UNIX. At 
Dynacomp, we offer UniPlus + ® System V 
by UniSoft Corp. For $1495. U.S. dollars 
you can run UNIX on the CompuPro® 
816/E” ... a powerful 68K 
S-100 bus computer system that 


FROM 



COMPUTER SYSTEMS LTD. 


maximizes its memory for multi-user/multi- 
task operations. 

UniPlus + includes all the standard 
UNIX System V features PLUS perform¬ 
ance enhancements found only in 
UniPlus +. These features increase the 
portability, flexibility, and performance of 
UNIX, allowing an affordable operating sys¬ 
tem for program development, text prepa¬ 
ration, and general office use. 

If it’s time for you to upgrade to UNIX, 
call your local Full Service CompuPro Sys¬ 
tem Center in the United States or call 
Dynacomp in Canada for com¬ 
plete details. 


46-6535 Mill Creek Dr. 
Mississauga. Ont. 

L5N 2M2 
(416) 826-8002 


*AT&T estimates that there are more than 100,000 people currently developing software under UNIX. Dynacomp serves all of Canada and parts of 
Asia and the Pacific Rim. Call us for details and information on our full product line including Plexus. • UNIX is a trademark of Bel 
CompuPro is a registered trademark and System 816/E is a trademark of Viasyn Corp. UniPlus + is a registered trademark of Umsoft Corp. AT&T 
is a registered trademark of AT&T Information Systems. 

Circle No. 28 on Inquiry Card 


plex instructions decoded by mi¬ 
crocode. With the advent of 
RISCs, more of the intelligence is 
moving into the software, where 
it is cheaper and easier to imple¬ 
ment, debug, and modify. 

“And of course the ability to 
implement a RISC using VLSI 
technology gives you a better 
price/performance ratio than 
was previously possible. The 
price/performance ratio worsens 
significantly when you go from 
a single chip implementation of 
a RISC to an on-board implemen¬ 
tation to a multiple-board 
implementation.’’ 

Unfortunately, you cannot run 
out and buy a MIPS computer just 
yet. MIPS Computer Systems does 
not plan to start shipping for 
another year. And, when it does, 
it will be selling to OEMs who will 
then build products around the 
boards. 

For more information on RISC 
and VLSI technologies, you might 
wish to refer to: “VLSI Proces¬ 
sor Architecture’’, IEEE Trans¬ 
actions on Computers , vol. c- 
33, no. 12, Dec. 1984, and “Re¬ 
duced Instruction Set Comput¬ 
ers’’, Communications of the 
ACM , vol. 28, no. 1, Jan. 1985. 

If you have an item 
appropriate for this column, 
you can contact Mr. Sobell at 
333 Cobalt Way, Suite 106, 
Sunnyvale, CA 94086. 


Mark G. Sobell is the author of 
the bestselling book , “A Practical 
Guide to the UNIX System ” (Ben¬ 
jamin/Cummings, 1984) and the 
new “A Practical Guide to UNIX 
System V” (Benjamin I Cummings, 
1985). He has been working with 
UNIX for over five years and 
specializes in documentation con¬ 
sulting, database programming, and 
troff typesetting. Mr. Sobell also 
writes, lectures, and offers classes in 
Advanced Shell Programming and 
awk. ■ 
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UNIX™ APPLICATION 
DEVELOPMENT 

TODAY is far more than the 
awkward collection of tricks and 
tools that are often labelled 
“4GL”. TODAY provides a 

COMPLETE application 
development environment that 
will revolutionize the way you 
develop and maintain applications. 

No UNIX* systems knowledge 
is necessary. 

Let’s put it frankly: developing 
an application is a costly pro¬ 
position. You’ll need a highly 
skilled team of designers, analysts 
and programmers, and several 
man-years to get things off the 
ground. And that’s not to mention 
the on-going costs of documenta¬ 
tion, customization and 
maintenance! 

TODAY tackles these problems 
through a new methodology with 
high performance architecture 
and a comprehensive range of 
features. It’s so quick and easy to 
use that TODAY developers can 
do the whole job—design, 
analysis, development and 
documentation. 

TODAY provides a compre¬ 
hensive range of features that 
keep application building easy 
while optimizing development 
resources: 

• Powerful recursive logic and 
Decision Tables 

• Synonyms, Menus, Prompts, 
Helps and Defaults for 
streamlined definitions 

• Screen Painter 

• A Report Generator which 
includes a Painter 


• Push button Self¬ 
documentation 

• Audit Trails 

• Source-code security through 
run-time only configurations 

• Developed Applications 
instantly portable across 
UNIX* systems 
Because definitions are 

Dictionary-based, any changes 
are easily made in one central 
location. A key feature, 

“tailoring” lets you alter an 
application — perhaps to 
customize it for a particular site 
or user — without affecting the 
original version. If required, 
applications can be set up as 
Models (Prototypes) and later 
enhanced to grow and change 
with the business. Tailoring 
versions is the perfect solution for 
quickly generating multiple 
applications based on one Model. 

TODAY runs under UNIX* or 
UNIX*-compatible operating 
systems on super-mini down to 
micro business computers using 
any of a range of databases. And 
if that’s not enough, TODAY is 
backed by 14 man-years of 
research and development and 
the confidence of users who are 
breaking time zones in software 
development. See us at Interex, 
Washington DC, September 
8-13, Booth 714 , and UNIX 
Systems Expo, New York City, 
September 18-20, Booth 1303. 


bbj Computer Services, Inc. 
2946 Scott Blvd. 

Santa Clara, CA 95054 
Telephone: (408) 727-4464 

Circle No. 3 on Inquiry Card 


Cure for Backlogs 
Induced by 3GLs 

in EDP Departments, 
Software Houses 
& Others 


Unix is a trademark of A T & T Bell Laboratories Inc. TODAY Copyright © bbj Computer Services Pty Ltd. Melbourne. Australia, July 1983 





DEVIL'S 

ADVOCATE 


Look for that silver lining 

by Stan Kelly-Bootle 


Many of you doubtless are 
aware of recent hiccups in the 
erstwhile exponentially expand¬ 
ing Computer Industry. You’ll be 
cheered to know I refuse to join 
those gloomy commentators who 
scream “major recession” and 
“Armageddon” every time Silicon 
Valley shuts down for longish 
weekends ranging from five to 15 
days. God knows those of us in 
the business have earned a 
break: the company parking lot 
holds an adequate inventory of 
workstations, my own UNIX inte¬ 
grated accounting package (code 
named “Larghissimo Ma Non 
Troppo”, which is the last re¬ 
maining musical idiom unregis¬ 
tered as a software product) is but 
a few tweaks from perfection, and 
the Giants have a homestand 
coming up against the Dodgers. 

When reading of plant closures 
and staff layoffs, it is tempting to 
proclaim that the Last Days of 
Tribulation are at hand: “And 
whosoever was not found written 
in the book of life was cast into 
the lake of fire” (Revelation 
20:15), where “book of life” is 
interpreted as a company annual 
report containing a healthy bot¬ 
tom line. 

I, though, prefer the word “hic¬ 
cup” to “shakeout”. This reflects 
my own calm view that what we 
are suffering is not the final 
judgment, nor even a chronic 
sickness. Rather, I maintain that 



we are being hypochondriac over 
the spurious spikes and annoying 
discontinuities starting to appear 
on sales and profits graphs. Most 
of these graphic anomalies, I say, 
are entirely due to the unendear¬ 
ing quirks of the Macintosh Image 
Writer. If you step back far 
enough (to the rear of the Welfare 
line, for example) and half-close 
your eyes, the trend curves be¬ 
come smoother and less ominous, 
the fuzzy pie-charts assume a 
more edible disposition, and, 
hopefully, those ghastly Macfont 
legends disappear altogether. 

It is good to see two of the 
leading mainframe manufactur¬ 
ers, Sperry and Burroughs, re¬ 
sponding to this by closing in on a 
merger that might reverse their 
ailing fortunes. It will not be easy, 
though, to forge a unified product 
line from two such disparate 


ranges. Indeed, outside of Apple, 
it would be difficult to find a pair 
of systems of such daunting 
incompatibility. 

In the widely debated case of 
the Rise and Fall of Home Com¬ 
puting, there is no doubt that 
even Invincible Business Ma¬ 
chines has been disappointed. A 
closer inspection reveals that the 
root of the problem lies in unjusti¬ 
fied expectations and crazy fore¬ 
casting. If you predict a 400 
percent growth, and crank up 
production accordingly, then a 
200 percent growth, miraculous 
by any normal standards, be¬ 
comes abject failure. The abso¬ 
lute number of home computers 
sold is a monument to marketing 
ingenuity and human gullability. 

I offer one recent snippet to 
back this view. Having replaced a 
$300 typewriter with a $2000 
Word Processor (“The advertise¬ 
ment said it was ‘affordable’ so 
how could I resist?”), the home 
computerist is next offered a $95 
software package that allows di¬ 
rect keyboard-to-printer mode! 
“Bypass all those time-consum¬ 
ing diskettes! Forget all those 
funny filenames!” 

The home computer peddlers 
have also overlooked the growing 
number of homeless persons (ex¬ 
cluding those with Ph.Ds in Com¬ 
puter Science) who, more than 
any other segment of the market, 
are in need of affordable, system- 
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ware development tools to professional programmers work¬ 
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Our CCA EMACS™ editor environment is fast becoming 
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Read on . . . 


UNIX, VAX/VMS, CCA EMACS and Safe C are trademarks respectively of Bell Laboratories, Digital Equip¬ 
ment Corporation, Computer Corporation of America, and Catalytix Corporation. 


U DEVIL'S ADVOCATE 


atic, integrated, press-any-key- 
when-ready general problem- 
solvers. 

Pessimists, especially laid-off 
pessimists facing eviction and 
vehicular repossession, may well 
quibble with my carefree analy¬ 
sis. Unemployment, it seems, 
breeds a nasty form of intolerant 
cynicism that inhibits any ration¬ 
al assessment of reality. Indeed, I 
meet many who blame Reason 
itself for their plight. “My job was 
secure until they rationalized 
production”, is a common com¬ 
plaint in Santa Clara County. 

Ironically, the very work-free 
people whose leisure to ponder 
objectively on the deep structure 
of the cosmos would be the envy 
of any Golden Age Athenian, 


Unemployment, it 
seems, breeds a nasty 
form of intolerant 
cynicism. 


waste their time casting stones at 
the blameless. 

Little did the semiconductor 
assembly line workers realize 
that, as predicted by Ezekiel, 
Marx, and Engels, they had been 
“tilling their own graves through 
the seven years of abundance; 
their lamps were left un-oiled, 


yea, they trimmed not the wicks 
thereon.” They have literally 
automated themselves out of a 
job, and should expect no sympa¬ 
thy from the millions who were 
earlier victims of automation in 
other trades. An industry that 
was founded on the proposition 
that machines can legally work 
for less than minimum rate must 
accept the logic of using comput¬ 
erized computer assembly and 
the ALO (Automatic LayOff) pack¬ 
age, which sends termination 
notices and W2s by electronic 
mail. 

Nowadays, no self-respecting 
chip wants to be manhandled 
into this world—or even photo¬ 
graphed alongside germ-ridden 
human fingernails. Letting the 
chips procreate and assemble in 
their own unsullied environment 
will certainly improve the yield 
and lower the unit cost. Naturally, 
there also will be cycles of glut 
and shortage—but eventually 
the brighter chips will adjust 
(they can hardly do worse than 
the semiconductor industry). 

Programmers should be aware 
of similar suicidal trends in auto¬ 
matic software generation! Rash 
attempts to simplify awk are just 
the thin end of a dreadful wedge 
that could lead to major layoffs. 
An unemployed programmer is a 
pitiful sight. I have seen a few in 
San Jose. They gather round 
Automatic Teller Machines, idly 
tapping dead keyboards and 
dreaming of past glories. 


Liverpool-born Stan Kelly-Boo- 
tle has been computing, on and off, 
at most levels since the pioneering 
EDS AC I days in the early 1950s at 
Cambridge University. After gradu¬ 
ating from there in Pure Mathemat¬ 
ics, he gained the world's first post¬ 
graduate diploma in Computer 
Science. He has authored “The 
Devil's DP Dictionary" and co¬ 
authored “Lem Yerself Scouse" and 
“The MC68000 Software Primer". ■ 
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The Concurrent Euclid language 
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UNIX/11, VMS/VAX, with code 
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on IBM-PC 
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CCA EMACS™...The Most Powerful Editor 
Environment Available for UNIX™ and VAX/VMS™ 


“Programming with CCA EMACS, I can look at two 
or more files at once in different windows and then 
move text between them" 

“POWER WINDOWS” are only part of the reason so 
many programmers are using CCA EMACS to make 
program editing and system development easier and 
faster. 

Unprecedented power, speed, functionality, extensi¬ 
bility, and consistency across systems and on any ter¬ 
minal are others. Nearly 400 built-in commands let 
you do any job with a few keystrokes. And with our 
Common Lisp-based extension language, Elisp™, you 
can customize CCA EMACS to meet all your specific 
program needs. 

CCA EMACS has two extensive recovery facilities, 
and is supported by a full online documentation pack¬ 
age designed for beginners and experts alike. 

This complete kit of editing tools runs under Berkeley 
UNIX (4.1 and 4.2 BSD), Bell UNIX (Sys. Ill and V), and 
VAX/VMS. Binary prices range from $380 to $850 for 
UNIX to $1900 for VMS. 


Uniworks, Inc. 

7? A Crowntek Company 

Productivity Tools for Programmers 
20 William Street • Wellesley, MA 02181 


FREE TELEPHONE TRIAL. Call into our system for 
a tutorial review and actual product trial for CCA 
EMACS or SAFE C™. 

For more information, telephone trial instructions, or 
to place an order, phone our customer representa¬ 
tives toll-free at: 

800 - 222-0214 

in MA 617-235-2600, or mail this form. 

VISA and MASTERCARD phone orders accepted 


Please send me information on: 

□ CCA EMACS □ The Safe C Development Tools 

□ AI Development Tools □ Your complete line of state-of- 
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File and record locking 

by Bill Tuthill 


As it stands today, UNIX is not 
the best operating system for 
running databases. It’s true that 
UNIX itself is reliable enough, the 
file system is robust enough, and 
that at least 4.2BSD is fast 
enough. But VAX/VMS, for one, 
provides two features that UNIX 
does not: system-level facilities 
for indexing files, and standard¬ 
ized file and record locking. Both 
of these features are critical to 
database applications. 

The lack of the first feature— 
indexed files—does not pose a 
huge problem. Because struc¬ 
tured access techniques like ISAM (indexed sequen¬ 
tial access method) and B-trees (balanced binary 
trees) are easy to implement on the user level, 
database packages can have indexed files without 
system help. It was UNIX, in fact, that proved that 
operating systems need not impose specific record 
structures on files. IBM/CMS, an operating system 
mired in useless notions of file structure, is an 
example of how it used to be. 

So far, so good for UNIX. But then comes the 
second problem. File and record locking—unlike 
indexed files—is difficult to implement on the user 
level. The most primitive locking method is to create 
a temporary file that acts as a lock. This is 
inelegant, inefficient, and insecure. Without inter¬ 
process communication, the best alternative is to 
create a lock device driver and configure it into the 
kernel. As we know, though, UNIX has no standard 
technique for interprocess communication. 

The bottom line is that UNIX oilers no standard 
means for file and record locking. This is not to say 
that no standard exists: /usr/group, in fact, has had 
a locking standard for several years. But, of the 
common versions of UNIX, only 4.2BSD and Xenix 



provide file locking. Xenix alone 
provides mandatory record lock¬ 
ing. Many independent UNIX 
vendors who don’t deliver Xenix 
(like Convergent, Fortune, Onyx, 
Plexus, and Zilog) have imple¬ 
mented the /usr/group standard, 
but they have sometimes done so 
in subtly incompatible ways. 

System programmers often 
recognize the importance of file 
locking. They know, for instance, 
that mail spool files should be 
locked during a mail reading 
session (so that mail isn’t deliv¬ 
ered unexpectedly after the spool 
file is changed). But programmers outside the 
database community generally don’t recognize the 
importance of record locking. UNIX kernel hackers 
are often hostile to the very notion. 

Consider an airline reservation system. While 
you are booking a seat on a particular flight, the rec¬ 
ord for seating on that flight must be locked so that 
somebody else doesn’t book the same seat as you do. 
Locking the entire database file, though, would be 
an uneconomical measure, for it would preclude 
other agents from booking different flights at the 
same time. Clearly, a multiuser database system 
must be able to lock records (or regions) within a file. 

If UNIX is ever to be successful at running serious 
database systems, it must have system primitives 
for file and record locking. This article traces the 
locking facilities available on various versions of 
UNIX. 

THE /usr/group STANDARD 

In the spring of 1981, John Bass published a 
paper in the Usenix newsletter ;login:, that detailed 
the interface and implementation of a lockingO 
system call for file and record locking. The proposal 
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called for mandatory locks. About a year later, 
/usr/group published a standard that included a 
lockfO system call that had the same parameters 
and locking modes. The difference was that the 
/usr/group standard allowed for both mandatory 
and advisory locks. Advisory locks may be circum¬ 
vented by programs not using lockfO , while manda¬ 
tory locks cannot. Files with the setgid bit set are 
subject to mandatory locking under this standard. 

Mandatory locks are probably not necessary, 
though. Both OS/360 and VAX/VMS have survived 
for years with advisory locks only, and many large 
databases, including airline reservation systems, 
have been implemented on these operating systems. 
Furthermore, mandatory locks are a potential 
security problem. Some user program could lock 
/ etc/passwd , for instance, and then go to sleep, 
causing the entire protection subsystem to hang. 

The proposed standard lockfO system call allows 
a process to lock sections of a file. Other processes 
that attempt to lock that section will either block 
until the section becomes unlocked or return an 
error value. All locks on a file are removed once the 
file is closed, and all locks for a process are removed 
when the process terminates. The lockfO call looks 
like this: 

lockf(fd. mode, length) 

int fd. mode: 

long length; 

The file descriptor fd must come from a success- 
ful openl). creatO. pipe(). or dup() system call. The 
mode may be F_LOCK to lock a region for exclusive 
use. F_TEST to test for other locks, F—TLOCK to 
both test and lock, and F-ULOCK to unlock a 
region. Actually F-TLOCK is a non-blocking lock: if 
a region is locked, it will return an error, rather than 
sleep. The third parameter, length , specifies the 
number of bytes to lock, measured from the current 
position in the file. This can. of course, be changed 
with the lseek() system call. Negative values 
indicate how far back from the current position to 
lock. Even locks past the end of file are possible if 
one wishes to protect against appending. If locked 
regions overlap, they are combined into a single 
region. 

The potential for deadlock occurs if a process 
controlling a locked resource accesses another 
resource locked by a different process and is thus 
put to sleep. Because of this, calls to lockfO, readO, 
and writef) scan for a deadlock before sleeping on a 
locked resource. An error is returned if sleeping on a 
locked resource would cause a deadlock. A sleep on 
a resource can be interrupted with any signal. Thus, 


the alarmO system call may be used to provide a 
timeout facility if necessary. 

Record locking for a simple ISAM database is 
relatively straightforward: lock the data, and lock 
the index pointer for the data. But in a B-tree 
database, record locking is much harder. Modifying 
a record may require shuffling the leaves of the tree. 
The safest and easiest thing is to lock the entire B- 
tree, but this may not be acceptable in highly 
sophisticated applications. 

SYSTEM V CONSIDERED LACKING 

System V has no mandatory file or record locking 
features. System V Release 2 has twice as many— 
none. To my knowledge, no UNIX system delivered 


The most interesting part of the 
new AT&T standard is that locking 
can be controlled with the fcntlfJ 
system call. 


by AT&T has mandatory file or record locking. 

ICurrent VAX and 3B2 releases of System V.2 do 
have advisory file locking, however. A 3B2 
release of V.2 scheduledfor later this year will in¬ 
clude mandatory record locking. — Editor / 

The System V Interface Definition includes the 
/usr/group locking standard, for advisory locks only. 
Mandatory locking may or may not be included in 
future specifications. The programming usage is the 
same as in the /usr/group standard, as are the 
modes F_LOCK. F—TEST, F—TLOCK, and 
F-ULOCK. 

The most interesting part of the new AT&T 
standard is that locking can be controlled with the 
fcntlO system call. This affords a distinction 
between read locks and write locks, something not 
present in the /usr/group standard. The file descrip¬ 
tor passed to lockfO must have O—WRONLY or 
O-RDWR permission in order to establish a lock. 

Locks may also be established with the F_SETLK 
or F-SETLKW command to fcntlO; the distinction is 
that F-SETLKW waits, whereas F_SETLK is non- 
blocking. Either command takes the arguments 
F_RDLCK and F-WRLCK to lock, and F-UNLCK to 
unlock. A read lock (F_RDLCK) prevents other 
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processes from write-locking the protected area. 
More than one read lock may exist for a given region 
at any given time. The file descriptor in question 
must have been opened with read access. A write 
lock (F_WRLCK) prevents any process from read¬ 
locking or write-locking the protected area. Only one 
write lock may exist for a given segment of a file at 
any one time. The file descriptor in question must 
have been opened with write access. 

In a production database system, it is best to 
place a read lock on a record during browsing. If the 
record gets modified, the read lock can be upgraded 
to a write lock (if no other read locks exist), the rec¬ 
ord can be quickly updated, and the write lock can 
be removed. The system must arbitrate race 
conditions, as when two processes are both waiting 
for a write lock. 

FILE LOCKING ON BERKELEY UNIX 

Recent releases of Berkeley UNIX (4.2 and the 
forthcoming 4.3) contain the Jlock() system call, 
which allows processes to place advisory locks on 
files. Since advisory locks are not enforced by the 
operating system, Jlock() is useful primarily for 
cooperating processes that have already agreed 
upon a locking protocol. When a process attempts to 
lock a file already locked by another process, Jlock() 
blocks until the first process releases the lock. If 
called with the LOCK_NB (noblock) option, 
however, Jlock() will simply return the error 
EWOULDBLOCK rather than block when it encoun¬ 
ters a file that is already locked. Both exclusive and 
shared locks are available. At any one time, a file 
may have only one exclusive lock, but multiple 
shared locks are permitted. This call establishes an 
exclusive lock, and will block until the lock in effect 
is released: 

if ftlock(fd. LOCK—EX) < 0) 
perror("fatal error: flock"); 

The following call, on the other hand, establishes 
a shared lock. It will block if the file has an exclusive 
lock, but not if there are multiple shared locks: 

if (flockffd. LOCK_EX) < 0) 

perror("fatal error: flock"); 

Another option is represented by the following 
call, which establishes an exclusive, non-blocking 
lock. It can be used when one can try again later, 
rather than wait for a lock to be released: 

if (flock(fd. LOCK_EXiLOCK—NB) < 0) 

perror("try again later: flock"); 


Any of the above locks can be released by this 
call: 

(void)flock(fd. LOCKUJN): 

In all of these examples, the file descriptor fd is 
obtained from a system call such as opeti(). The 
Jlock() system call returns -1 if the file descriptor is 
invalid, or if it does not refer to a file. 

Some Berkeley UNIX commands that establish 
locks are tip (when writing a log of the call), dump 
(when recording information about incremental 


John Bass should be commended 
for his work on file and record 
locking. 

dumps), and some versions of mail (for locking the 
spool file during mail browsing). 

The main problem with the flockQ facility is that 
it lacks record locking. Database applications could 
perform record locking by using a socket-based lock 
manager, but this would be slow compared to record 
locking done by the kernel. 

FILE AND RECORD LOCKING ON XENIX 

Xenix release 2.0 includes the lockingl) system 
call for performing both file and record locking. This 
facility is similar to the original 1981 lockingf) 
proposal by John Bass, except that it provides read 
locks in addition to read/write locks. As in the 
original proposal, all locks are mandatory rather 
than advisory. 

The programmer supplies as parameters the file 
descriptor, the locking request, and the number of 
bytes to lock: 

locking(fd. mode, length) 
int fd. mode; 
long length: 

The file descriptor is obtained from a successful 
system call such as opert(). The number of bytes to 
lock is specified from the current position in the file, 
which can of course be changed with the lseek() 
system call. The available modes are LK—LOCK 
to lock a region, LK_NBLCK to lock a region with¬ 
out blocking, LK_RLCK to read-lock a region, 
LK—NBRLCK to read-lock a region without blocking, 
and LK_UNLCK to remove any of the above locks. 
Both LK_LOCK and LK—RLCK wait until the lock is 
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available: LK_NBLCK and LK—NBRLCK return an 
error instead. 

Portions of a file may be locked against both 
reading and writing, or just against writing. Pro¬ 
cesses that attempt to read or write a file region 
locked against reading and writing by another 
process (using LK_LOCK or LK_NBLCK mode) will 
sleep until that region has been released. Processes 
that attempt to write to a file region locked against 
writing by another process (using LK—RLCK or 
LK—NBRLCK mode) will sleep until that region has 
been released. 

It was helpful for Microsoft to implement read- 
locks, but the facility that Xenix provides conforms 
neither with the /usr/group standard, nor with the 
System V interface definition. This may change in 
Xenix 5.0, however. 

CONCLUSION 

John Bass should be commended for his work on 
file and record locking. Without him, there would be 
no standard today. His public-domain locking 
facility has not only been published as part of the 


/usr/group standard, but has been included by 
various far-sighted vendors. Even some people 
within AT&T have finally seen the light, and have 
included locking primitives in the System V 
Interface Definition , as well as an intelligent 
interface with fcntl(). 

As time goes by, I believe Bass’ locking standard 
will become widely accepted. The 4.2BSD locking 
facility will die out simply because it does not 
provide adequate functionality (it has no record 
locking). Perhaps in a few years, programmers will 
be able to write database systems with the certain 
knowledge that file and record locking will be 
available on all UNIX systems. Until then, we will 
have to limp along with a standard that is only 
partially standard. 


Bill Tuthill was a leading UNIX and C consultant at 
UC Berkeley for four years prior to becoming a member 
of the technical staff at Sun Microsystems. He enjoys a 
solid reputation in the UNIX community earned as 
part of the Berkeley team that enhanced Version 7 (4.0, 
4.1 , and 4.2BSD). ■ 
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THE UNIX 
GLOSSARY 


Database terminology 

by Steve Rosenthal 


Note: Only those meanings 
applicable to databases and 
UNIX have been included in 
this listing. 

activity ratio—the proportion of 
a file that has been read, updated, 
or written during a given period. 
The optimal method of storing a 
large amount of data can vary in 
accordance with the expected ac¬ 
tivity ratio. 

add—to introduce a new record 
into a database. In some systems, 
additions are classified as ap¬ 
pends (at the end of the file) or as 
inserts (within the file). The addi¬ 
tion of data to a field in an existing 
record is known as an update. 

ad hoc—said of queries to a 
database management system 
(DBMS) made directly by a user 
rather than as part of a program. 
Most of the more sophisticated 
DBMS packages for UNIX support 
an ad hoc query facility that 
typically uses a format resem¬ 
bling IBM’s SQL (Structured Que¬ 
ry Language). This feature allows 
non-programmers to get quick 
answers to simple questions. 

aggregate—to derive a value, 
such as a count or sum, based on 
the contents of many records. 

applications development sys¬ 
tem—a software package intend¬ 
ed to help with the development of 



applications programs, generally 
by using descriptions of needed 
inputs, outputs, and their rela¬ 
tions. This saves the user from 
having to specify each step, as 
would be necessary with proce¬ 
dural languages. Some applica¬ 
tions development systems for 
UNIX function as complete user 
shells, providing a user interface 
that substitutes for the normal 
UNIX environment. Compared to 
program generators, applications 
development systems are gener¬ 
ally more sophisticated, but usu¬ 
ally produce code that requires a 
more extensive runtime support 
package. Applications develop¬ 
ment systems are often called 
“application generators”. 

atomic—an indivisible opera¬ 
tion, that must either be run to 


completion or aborted altogether. 
Transaction updating is a typical 
atomic operation. 

attached processor—a supple¬ 
mentary processing unit used to 
speed up the processing of spe¬ 
cialized types of data. For exam¬ 
ple, arithmetic calculations or 
database searches can be facili¬ 
tated by the use of an attached 
processor. When the processor is 
a chip or a board, the term co¬ 
processor is commonly used. 

attribute—an item of informa¬ 
tion entered into a single field (or 
a “single cell” in a row-and- 
column database). In relational 
databases, columns are often re¬ 
ferred to as “attributes”. 

audit trail—the recording of 
each update, addition, or deletion 
of records such that a database 
can be reconstructed later by 
referring to logs. 

browser—a program or mode 
allowing the user to look through 
a database on screen without 
using a set of procedural display 
commands. Most browsers also 
allow users to move through data 
using simple keystrokes to indi¬ 
cate direction. Many will let users 
change data by overwriting old 
data. 

B-tree—shortened term for “bal¬ 
anced tree”, a way of organizing 
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pointers to information in data¬ 
bases that allows quick retrieval 
of any single specified record. So- 
called “B + ” or “B*” trees allow 
records to be efficiently retrieved 
in sequential order. Many data¬ 
bases designed for UNIX use B + 
trees for their indices, and some 
arrange their records using this 
kind of structure in preference to 
the UNIX file system. 

cardinality —in formal des¬ 
criptions of databases, “cardinal¬ 
ity” refers to the number of tuples 
in a table or set. Translated into 
everyday language, this means 
the number of records in a data¬ 
base or the number of rows in a 
table. 

CODASYL —an acronym (pro¬ 
nounced Code-a-sill) for Confer¬ 
ence on Data Systems and Lan¬ 
guages, a computer industry 
organization set up by the US 
Department of Defense in the late 
1950s. The most famous product 
of the group is the language 
COBOL, but it also developed a set 
of standards for database struc¬ 
tures that have been used with 
other languages as well. 

column —in databases that use a 
table-like organization, a “col¬ 
umn” is a record component that 
contains similar information in 
each record (and thus is repre¬ 
sented in a single column when 
the data is displayed as a table). In 
more traditional database termi¬ 
nology, “field” is an analogous 
term. 

concurrency control —in a dis¬ 
tributed system, “concurrency 
control” is used to ensure that 
simultaneously input events do 
not interfere with each other or 
lead to the processing of incom¬ 
plete records and files. The usual 
method for exercising this control 
is for the system to finish the 
processing of one transaction be¬ 
fore allowing another user to 
access the same record or file. 


database —in the most general 
sense, “database” refers to any 
clearly identified collection of 
data. Some people differentiate 
between a “data base” (two 
words), meaning an underlying 
collection of data in the real 
world, and a “database” (single 
word), meaning a coherent collec¬ 
tion of data stored in a computer 
system. When taken as the latter, 
the word makes particular refer¬ 
ence to data organized so that 
various programs can access and 
update it. 

database management systems 
(DBMS) —a program or set of 
programs providing a framework 
for creating, editing, and main¬ 
taining collections of data for use 
by different programs. DBMS sys¬ 
tems serve as an interface be¬ 
tween programs and data and 
may also include a query facility 
for making individual (ad hoc) 
requests for information from the 
database and a reporting facility 
for producing formatted listings 
of selected data. 

data dictionary —the “data dic¬ 
tionary”, in most complex data¬ 
bases, is a list of defined fields 
and record formats. It is used as a 
guide or constraint to ensure that 
all programs using the database 
treat the data consistently. 

decompose —to change a request 
couched in non-procedural lan¬ 
guage to a required procedural 
form. This is one of the principal 
tasks of an ad hoc query system. 

delete —to logically remove infor¬ 
mation (usually a record) from a 
database. Often, deleted data is 
marked but not physically re¬ 
moved until the file is copied or 
consolidated. 

field —in data entry, a “field” is 
an area in which a certain type of 
information is to be placed. For 
example, a database program 
might reserve a 10-character field 


for area code and telephone num¬ 
ber. Conceptually, fields are simi¬ 
lar to the blanks to be filled in on 
a pre-printed form. When shown 
on a screen, fields are generally 
marked by flags indicating a 
beginning and an end, by tempo¬ 
rary fill-in characters (such as 
periods), or by differences in color 
or brightness. 

file —a group of records treated 
as an overall unit by the operating 
system. In some types of data¬ 
bases, each file is made up 
of structurally identical records, 
while other types allow variations 
in a single file. 

get —to retrieve records from a 
database, or at least to mark them 
for further processing. A get 
operation is often followed by a 
definition of the desired group of 
information. 

hash —to make a pointer or index 
by applying a transformation to 
the characters or values compris¬ 
ing a key or record. Hashing 
provides a very fast way of index¬ 
ing large lists or databases, but it 
requires complex programming to 
deal with collisions (when the 
hash function produces identical 
results for different input values) 
and to fold long keys into short 
hash values. Many UNIX utilities 
use hash functions to create 
pointers to their internal tables. 

hierarchical —a model for orga¬ 
nizing data that uses “owner¬ 
ship” as its basic conceptual unit. 
Each item “belongs” to a higher 
item, and is accessed through 
that higher item. This model is 
used on most CODASYL-type da¬ 
tabases, including most of those 
written in the COBOL language. 
The UNIX file system can also 
be thought of as a hierarchical 
database. 

indexed sequential access meth¬ 
od —a method of organizing files 
that is most popular on older 
magnetic tape-based systems. It 
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depends on keeping the file in 
overall order, but creates overflow 
areas and indices for changes 
that don’t fit in place. Periodical¬ 
ly, the entire file must be reorga¬ 
nized. Some of the older UNIX 
utilities use ISAM, but most use a 
direct access method better suit¬ 
ed to disk storage. 

insert —to add records to a file, 
particularly in a mode that allows 
additions to the middle of a file (as 
opposed to an append operation, 
which only allows additions at 
the end of a file). 

ISAM —an acronym (pronounced 
“eye-sam”) for Indexed Sequen¬ 
tial Access Method. This arrange¬ 
ment of data was pioneered by 
IBM when magnetic tape was the 
principal means of storing data, 


but it’s only used rarely now by 
small systems that have disks. In 
effect, ISAM keeps data in a basic 
order, making an exception list of 
out-of-order items as data is add¬ 
ed, deleted, or changed. Periodi¬ 
cally, the file must be cleaned and 
rewritten to eliminate the excep¬ 
tion list. 

join —in general, to combine two 
databases (or when used as a 
noun, “join” refers to the result of 
that combination). In particular, 
as applied to relational data¬ 
bases, “join” means the creation 
of a new file containing all the 
records of a second file referred to 
by yet another file. One example 
would be a shipping list file 
produced by joining an orders file 
with an inventory file. 


key —the part of a record that 
will be used as a identifier when 
records are indexed or sorted. For 
example, in the telephone book, 
the key is each subscriber’s 
name. Some databases allow du¬ 
plicate keys (where the key is the 
same for two or more records), but 
other systems require that each 
key be unique. 

log —a list of transaction records 
entered since a certain check¬ 
point. Good practice calls for each 
transaction to be logged before 
requesting an update or oper¬ 
ation. With a log, data can be 
reconstructed following a system 
failure by using records dating 
from an appropriate checkpoint 
to resubmit all transactions. 

modify —to change the content 
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of a database or the structure of a 
record or file. All databases for 
UNIX support modification of da¬ 
tabase content, but only some 
will allow record structures to be 
modified once data has been 
entered. 

non-procedural —said of sys¬ 
tems and languages where users 
specify what they want done 
instead of how to carry it out. For 
example, in a database, a non¬ 
procedural query might ask for all 
records with values between cer¬ 
tain limits, while the equivalent 
procedural statements would ac¬ 
tually specify how to sort and 
select the database to find those 
records. 

project— to make a selection 
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of possible fields from database 
records for further processing. 
The fields may be selected from 
all records in the database or 
from a more restricted set. 


QBE —short for “Query By Ex¬ 
ample”, a non-procedural meth¬ 
od of specifying record selection. 
See query by example for details. 


query by example —to find or 
select records in a database by 
specifying acceptable ranges of 
values in a sample record instead 
of programming the steps needed 
to make that selection. This fea¬ 
ture, which allows database que¬ 
ries to be made by those who 
are not expert programmers, 
has been steadily moving down 
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from large mainframe systems to 
even modest-sized UNIX database 
management systems. 

query language— a computer 
language that acts as an interface 
between user and database to 
facilitate the retrieval of informa¬ 
tion. Most query languages are 
procedural, based on a set of 
commands (often called verbs) 
and qualifiers. The trend, howev¬ 
er, is toward “natural language” 
interfaces that allow queries 
posed in forms more akin to 
ordinary speech. 

record —a collection of entries in 
a database filed under a single 
key or identifier. Files are com¬ 
posed of records, which in turn 
are composed of fields. Each rec¬ 
ord usually represents a single 
instance of the type of informa¬ 
tion collected in the database, 
such as all information related to 
a single part number or the wages 
of a single employee. In the rela¬ 
tional database model, a record is 
equivalent to a single row in a 
table. UNIX itself treats files as 
streams of characters, so the 
database program must handle 
the task of grouping information 
into logical records. 

record-locking —the exclusion 
of users from accessing (or, some¬ 
times, just writing to) a record 
that another user is already up¬ 
dating. This prevents the corrup¬ 
tion of data that might occur if 
each user could make changes 
without taking stock of changes 
made to the same data by other 
users. Record-locking affects only 
those records in use, allowing 
other users to access other parts 
of the file. AT&T has added rec¬ 
ord-locking to System V Release 
2, but previous versions of UNIX 
lacked a standard record-locking 
call. 

relation— a table (row-and-col- 
umn structure) with attributes 
forming the columns, and tuples 
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(records) forming the rows. Each 
relation represents a linking of 
data values with the attributes or 
fields in which they fall. Note that 
usage is moving towards calling a 
combination of two tables a “rela¬ 
tion”. In formal terminology, this 
would be known as a “join”. 

relational —in the strict sense, 
“relational” refers to databases 
that are conceptually organized 
in a row-and-column format, 
with the data defined as the 
relation of a part of a record (the 
row) to a category or field (the 
column). In recent use, however, 
the term has been taken to refer 
to a database that can join or 
display two or more files based on 
a shared field or category. 

remove —to take values or 
records out of a database. Often, 
“removal” implies actual phys¬ 
ical erasure or reorganization, 
while “deletion” may signify 
that material was only logically 
eliminated or marked for later 
removal. 

replace —to change values in a 
database record or field to new 
ones. Often, replace implies the 
complete change of an entire 
record or field, while a modify or 
update operation may change 
only a portion. 

report —when referring to data¬ 
bases and their use, “report” 
refers to a collection of data 
gathered, formatted, and output 
according to user request. Getting 
a formatted, focused report out of 
a database is often much more 
difficult than collecting the data 
in the first place. 

report writer —a program or sub¬ 
system that produces reports 
from a database. Most early re¬ 
port writers required users to 
specify selection and formatting 
steps in extensive detail, but the 
trend in many recent packages 
has been to offer a menu-driven 
interface. 


restrict —to select certain re¬ 
cords from a database, either for 
display or further processing. 

restructure —to change the orga¬ 
nization of a database, particular¬ 
ly the layout or makeup of fields 
in each record. Only some data¬ 
bases allow restructuring once 
data has been entered. 

retrieve —to get records from 
a database. Some people use 
“retrieve” as a synonym for the 
general operation of reading re¬ 
cords back from storage, while for 
others it implies a selection of 
certain records. 

row —in the relational database 
model, the values that print out 
on a horizontal line when a file 
(relation) is shown in table form. 
This includes all the values in the 
table associated with a single key, 
and is also termed a “tuple” or 
“record”. 

select —in a sort or transfer oper¬ 
ation, “select” refers to efforts to 
isolate all the records meeting a 
specified criteria. A select always 
implies restriction of the set, 
and sometimes it also implies 
retrieval. 

SQL —short for Structured Query 
Language, an approach to ex¬ 
tracting information from data¬ 
bases pioneered by IBM. 


table —a file or relation logic¬ 
ally organized in row-and-column 
form. Each record (row or tuple) is 
identical in format, which makes 
for easier processing and joining 
with other tables. 

transaction —an update, addi¬ 
tion or deletion of a record in a 
database. Transactions are usu¬ 
ally treated as atomic (they either 
must be completed or backed out 
completely). Systems designed for 
transactions are optimized to 
handle many changes. 

tuple —the more formal name for 
a row in a relation (table). It is a 
shortened form of “n-tuple” (tak¬ 
en from the realm of mathe¬ 
matics) that is used to refer to 
multiple attributes (fields or col¬ 
umns) associated with each row. 

update —to change information 
in a database without resorting to 
removing records, making new 
ones, or creating copies. Updates 
can be done in real-time (transac¬ 
tion-based), or in batches (batch 
mode). 

Comments, questions, or cor¬ 
rections? Please send them to 
Rosenthal's UNIX Glossary, Box 
9291, Berkeley, CA 94709 . 


Steve Rosenthal is a lexicogra¬ 
pher and writer whose work appears 
regularly in six personal computer 
magazines. m 
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RECENT 

RELEASES 


PLEXUS FILLS OUT LINE 
WITH P/20 

In an effort to offer a low-end 
microcomputer to system integra¬ 
tors and customers with several 
end users, Plexus Computers, 
Inc., has introduced the P/20. 
Part of the P/15 family, the P/20 is 
said to offer more networking 
options than the P/15, and yet 
offer a lower price and compara¬ 
ble performance to the P/35. This 
new model will compete with 
such machines as the SCR Mini- 
taur and AT&T 3B2. 

The base package for the P/20 
features a 24 MB hard disk, a 1 
MB floppy, .5 MB of memory, and 
accommodates eight users. This 
package operates under a Plexus 
port of UNIX System V, Release 2; 
the basic offering contains the 
utilities needed to get the system 
going: an editor, uucp, cu, and 
tape and backup facilities, ac¬ 
cording to Lynn Macey, Plexus’ 
National Analyst Manager. He 
also said the P/20 is fully licensed 
for a complete port if a VAR 
wishes to install it. The machine 
incorporates dual MC68010 mi¬ 
croprocessors that run at 10 MHz 
with no wait states, and SCSI and 
Multibus interfaces. The Multi¬ 
bus permits the addition of a 
controller board to provide up to 
16 serial ports. The base package 
is priced at $10,950. A fully 
outfitted P/20, with 152 MB of 
hard disk and 2 MB of floppy, 
carries a price of $20,300. 

Plexus Computers, Inc., 3833 
N. First St., San Jose, CA 95134, 
408/943-2248. 
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Plexus Computers adds to its 
model line with the P/20. 


ALTOS: DO I HEAR 20? 

DO I HEAR 30? 

For those who don’t live in 
a Spanish-influenced region of 
the country, “Altos” translated 
means “tall”, or “the tall ones”. 
The San Jose-based computer 
firm bearing the name may have 
had this in mind when plotting its 
new marketing strategy, for it 
appears to be setting its sights 
high. Altos is intensifying its 
marketing efforts around two 
new products, the 2086 and 
much-anticipated 3068. 

The Altos 2086 supermicro is a 
high-end addition to the com¬ 
pany’s Intel CPU-based product 
family. Rather than compete with 
the 586 and 986, which support 
up to five and nine users respec¬ 
tively, the 2086 is sold by Altos as 
a complement or upgrade to these 
machines, as it supports up to 20 


users (hence the model name). 
The price reflects this upgrade 
($19,990 for the base configura¬ 
tion, compared to $7990 for the 
586 and $ 11,990 for the 986), but 
the 2086’s specs are worthy 
of consideration: based on a 16/ 
32-bit Intel 80286 running at 8 
MHz, the box comes with 2 MB of 
RAM, an 80 MB hard disk, a 1.2 
MB floppy, a 60 MB streaming 
tape unit, and an Altos III termi¬ 
nal. Hard disks can be upgraded 
to 189 MB (formatted) in 63 
MB increments. The 2086 runs 
Xenix 3.0. 

The Altos 3068 has the same 
modular design characteristics 
and many of the same compo¬ 
nents as the 2086, but it is a 
notably distinct machine—for 
two reasons. First, the 2086 was 
designed as a dealer product, to be 
marketed through Altos distribu¬ 
tors, though Altos hopes for inter¬ 
est from major accounts as well; 
the 3068, on the other hand, is 
designed as an OEM product. 

The second distinction is espe¬ 
cially noteworthy: the 3068 is a 
mass-produced supermicro based 
on the 32-bit MC68020 micro¬ 
processor. Announced last March 
and available as of last month, 
the 3068 comes with a base 
package featuring 1 MB of RAM, a 
20 MB hard disk, and a 1.2 MB 
floppy supporting up to 10 users. 
When properly outfitted, howev¬ 
er, this machine will support up 
to 30 users. The design includes 
eight board slots with four avail¬ 
able for custom configuration. 
RAM can be expanded to 16 MB in 
1, 2, or 4 MB increments; hard 
disks can be upgraded to 240 MB 










(unformatted) in 20, 60, or 80 MB 
increments. 

Other available features in¬ 
clude a streaming magnetic tape 
unit, with up to 60 MB of backup 
storage; an operating system (the 
3068 runs System V) supporting 
demand-paged virtual memory 
with 1 K page size; and various 
software products, including the 
Altos Office Manager (which in¬ 
cludes a windowing package) and 
a database management package 
based on Unify. 

Altos emphasizes that while 
the 3068 capably functions as a 
standalone system, it can be 
joined with other Altos multiuser 
systems via the Altos WorkNet 
local-area network, and with 
mainframes via 3270 Bisynch, 
SNA, X.25, and 3780 communi¬ 
cations options. With Altos PC 
Path attached to WorkNet, the 


3068 can also act as a file server 
and communications gateway for 
personal computer users. 

Philon, Inc., of New York City, 
has been selected as one of the 
suppliers of compilers for the 
3068. BASIC-C, BASIC-M, CO¬ 
BOL, and C .compilers are pres¬ 
ently available, and Fortran, Pas¬ 
cal, and RPG compilers will be 
ready later this year. 

The base configuration price 
for the Altos 3068 is approxi¬ 
mately $7000 in OEM quantities. 

Altos Computer Systems, 2641 
Orchard Pkwy., San Jose, CA 
95134, 408/946-6700. 
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SWEET TALK THROUGH 
THE (RCA) MAIL 

SofTest, Inc., has signed a 
contract with RCA Service Corpo¬ 


ration for SofTest’s new com¬ 
munication product. Sweet Talk. 
RCA will be using Sweet Talk to 
access the RCA Mail network and 
provide access to RCA Mail for its 
customers. 

Sweet Talk is a UNIX-based 
product that can link the user 
with on-line database services 
and remote computers, as well as 
electronic bulletin boards and 
mailing services. It claims to 
bring to UNIX all of the features of 
various popular MS-DOS prod¬ 
ucts such as CROSSTALK and 
SmartCom, as well as some other 
benefits not available to PC users; 
Sweet Talk is also compatible 
with these two products. Since it 
is UNIX-based, Sweet Talk can be 
shared and used concurrently by 
several people on the same com¬ 
puter. It is now available on Altos 
computers and Radio Shack 16Bs 


ACUITY® business software 
is compatible with any budget, 
and all these systems: 


AT&T 3B’s 
Motorola 

Charles River Data 
Sun Microsystems 
All Unix based micros 
All Unix “look-alikes" 


Plexus 

Convergent 

Cromemco 

Altos 

Harris/VOS 

VAX/Ultrix 


Gould 

Sperry 

Momentum 

Dual 

Harris/Unix 

VAX/VMS 


Serving general accounting, wholesale, distribution, 
manufacturing and project/job costing applications on 
over 30 different machines, ACUITY allows you to 
select from individual modules to build a fully inte¬ 
grated software system specifically for your needs. 

Accounts Payable » Accounts Receivable 
General Ledger • Fixed Assets • Payroll 
Customer Order Processing • Inventory 
Purchasing/Receiving • Project Management 
MRP • Master Scheduling • BOMP 
Project Scheduling • Labor Projections 
Work Breakdown Structure 

For more detailed information, call 619/474-6745. 

coGniTion 

225 West 30th Street, National City, California 92050 

Circle No. 37 on Inquiry Card 


57 VU 555"" 


Version 3d0 Available Now! 

The Reliable High Performance APL 
for UNIX* Systems 

Dyalog APL is fast! 

Version 3.0 is up to 10 times faster than previous versions! 


Dyalog APL is functional! 

Nested Arrays 
Full Screen Editor 
Full Screen Data Manager 
Event Trapping 

Interface to all UNIX* Facilities 
Optional Graphics 


Dylalog APL is reliable! 

Dyalog APL has been in commercial use for over two years 
and is available NOW for most UNIX* Systems so call or 
write today. 


MIPS Software Development, INC 

31555 W. 14 Mile Rd. #104 


Farmington Hills, MI 48018 
(313) 855-3552 


' Improvtmenu ire a function of »y»tem and uta^e 
• UNIX i* a trademark of AT&T Bell Laboratory 
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RECENT RELEASES 


and 6000s, and SofTest plans to 
port Sweet Talk to the AT&T and 
IBM lines of UNIX machines, 
among others. Single copies of 
Sweet Talk sell for $300. 

SofTest, Inc., 555 Goffle Rd., 
Ridgewood, NJ 07450, 201/447- 
3901. 

Circle No. 39 on Inquiry Card 

ANALOG RUNNING IN 
APOLLO'S DOMAIN 

Analog Design Tools, Inc., has 
announced it will develop a ver¬ 
sion of its analog circuit design 
software to run on all of Apollo 
Computer’s DOMAIN 32-bit fam¬ 
ily of engineering workstations. 
Originally designed for the Sun 
workstation, Analog’s computer- 
aided engineering software also 
has been produced in a recently- 
released version for Daisy Sys¬ 
tems, as well as others to be 
announced later. 

Under the terms of a Software 
Supplier Agreement, Analog will 
provide Apollo with training in 
the operation of its Analog Work¬ 
bench software, along with sup¬ 
port and assistance for demon¬ 
strating the software on Apollo 
workstations. Apollo in turn will 
provide Analog with technical 
support and will inform Apollo 
customers of the availability of 
Analog’s software for its systems. 

Analog Design Tools, Inc., 800 
Menlo Ave., Suite 200, Menlo 
Park, CA 94025, 415/328-0780. 
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FORTRAN LIBRARY FOR 
HP 9000 

As evidence of the continuing 
penetration of UNIX into the sci¬ 
entific/engineering market, wit¬ 
ness the now-available collection 
of over 500 Fortran subroutines 
that can be run on Hewlett- 
Packard Series 9000 computers. 
The IMSL Library contains tested 
programs for a range of mathe¬ 
matical and statistical applica¬ 


tions that can be selected by a 
programmer rather than devel¬ 
oped from scratch. The Library 
has been available for some time 
on HP 1000 and 3000 Series 
machines, and is now compatible 
with the 9000 Series (Models 520, 
530, and 540) running a Fortran 
77 compiler under HP’s version of 
UNIX, HP-UX. 

An annual supported license is 
priced at $1200 for the initial 
year, and is renewable at $1000. 
IMSL also offers a reduced price to 
educational institutions and dis¬ 
counts on multiple purchases. 

IMSL Sales Division, The NBC 
Building, 7500 Bellaire Blvd., 
Houston, TX 77036, 1/800/222- 
IMSL; in Texas, 713/772-1927. 
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ALIS: THROUGH THE 
UTEK-GLASS 

A contractual agreement has 
been reached between Applix, 
Inc., and Tektronix, Inc., whereby 
Tektronix will market Alis, the 
Applix office software system. 
Alis will be run under UTEK, the 
Tektronix version of UNIX, on the 
Tektronix 6000 family of 32-bit 
workstations. Applix will port 
Alis to the Tektronix 6130 proces¬ 
sor and provide support for the 
6000 family and 4107 terminals. 

UTEK is a 4.2BSD-based port 
of UNIX, combining 4.2 with 
aspects of System V, Release 2. It 
also offers Tektronix enhance¬ 
ments, according to Tektronix 
Product Support Manager Bruce 
Harris, including a virtual mem¬ 
ory system faster than 4.2’s, and 
an added distributed file system. 
The operating system is support¬ 
able on workstations and comes 
with packaged learning sessions. 
Though it is “large UNIX’’, Tek¬ 
tronix claims the advantage of 
compatibility with both 4.2 and 
V.2. 

Applications of the Alis system 
include multifont word process¬ 


ing, drawing, spreadsheet, busi¬ 
ness graphics, electronic mail, 
and network-based information 
sharing. The package works 
at combining graphics-based 
and integrated PC applications 
with communications-based of¬ 
fice automation systems. Applix 
has previously announced agree¬ 
ments with Convergent Technol¬ 
ogies and Computer Sciences 
Corporation. 

Applix, Inc., 112 Turnpike Rd., 
Westboro, MA 01581, 617/870- 
0300. 
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IN-HOUSE PUBLISHING 
FROM ETP 

ETP Systems, Inc., has pro¬ 
duced an in-house publishing 
system providing laser printer 
typography for UNIX computer 
users. A turnkey system that 
includes an Imagen 300 dot/inch, 
8 page/minute (12 and 24 page/ 
minute also available), Canon- 
powered laser printer, is distin¬ 
guished from other publishing 
systems by ETP-developed “usr/ 
tools’’ software. This software 
package contains an enhanced 
device independent troff (with a 
focus on menu-driven processing 
of troff commands), several mac¬ 
ro formatting packages, a laser 
printer driver, a choice of up to 33 
fonts, and a font magnification 
program (with a capacity for siz¬ 
ing typefaces from 6 to 72 points). 

ETP Systems markets its pub¬ 
lishing package through com¬ 
puter manufacturers and OEMs, 
and provides a product support 
package that includes descriptive 
data sheets, sample output, and a 
manual set. The in-house system, 
with three fonts, is approximately 
$13,000; additional fonts are 
available at $175 per face. 

ETP Systems, 10150 SW Nim¬ 
bus Ave., Suite E-2, Portland, OR 
97223, 503/639-4024. 
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Only Sperry can make the 
following four statements. 

Our PC runs the XENIX™ 
system, as well as MS-DOS™. 

Our 4 new microcomputers 
run the UNIX system. 

Our new minicomputer runs 
the UNIX system. 

Our Series 1100 mainframes 
run the UNIX system. 

All of which means there is 
a great deal we can do for you. 


For instance, our family of 
computers based on UNIX 
systems has incredible trans¬ 
portability for all your software. 

And being able to accom¬ 
modate from two to hundreds 
of users, it’s impossible to out¬ 
grow our hardware. 

Of course, this linking of all 
your computer systems can add 
measurably to your productivity. 

And a fast way to find out 


more is to get a copy of our 
Sperry Information kit. For 
yours, or to arrange a demon¬ 
stration at one of our 
Productivity Centers, call 
1-800-547-8362. 

‘UNIX is a trademark of AT&T Bell Laboratories 
XENIX and MS-DOS are trademarks of Microsoft 
Corporation 

©Sperry Corporation 1985. 



Introducing an idea 
that makes obsolescence obsolete. 



The UNIX operating system 
from PC to mainframe. 
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DATABASE OVERVIEW 



DATABASE OVERVIEW 

Continued from Page 33 

or exists simultaneously in two accounts. Transac¬ 
tions can hide this anomaly from other users, and 
can ensure that if the system crashes at this point, 
the database will be restored to a consistent state. 
This requires a more complex commit operation, 
and deadlocks become even more common and 
more difficult to resolve (since many objects—not 
just one—may have been updated). 

Audit Trails. It has been said that lawyers and 
bookkeepers will inherit the earth. To assuage our 
future owners, databases containing information of 
legal significance should include the ability to 
maintain audit trails—logs of all changes (and 
possibly accesses) made to a database. 

If care is taken in designing the log, audit trails 
(sometimes also called transaction logs) can be 
used to “roll back” a database to a previous state, 
that is, to undo changes that have been made. This 
can be necessary for recovery from system crashes 
or deadlocks. 

Backup/Recovery. Databases should be dumped 
periodically. A database dump is equivalent to a 
“level zero” dump on UNIX. In addition, a dump of 
the transaction log (see above) can be considered 
equivalent to an “incremental dump” on UNIX; a log 
can be used to roll forward a database and redo the 
incremental changes made since the last database 
dump. 

An important issue is whether the database can 
be backed up while it is live (available for users), or if 
it must be in a quiescent state. This is important be¬ 
cause some applications simply cannot afford to be 
offline for the several hours it can take to back up a 
large database. 

UNIX-style dumps are not normally acceptable. 
For example, if one record in a 10 million-record file 
changes, a UNIX incremental dump will save the 
entire file, while a clever database backup will save 
only the changed record. 

Systems supporting very large databases should 
allow dumps of subsections of the database. 

Protection. Often it is necessary to restrict 
access to data. For example, “managers can read 
the salaries of the people who work for them; the 
personnel department can read all salaries; all 
other access to salaries is denied”. This can be 
relatively low resolution (perhaps access could be 
limited on a per-file basis—as on UNIX) or very high 
resolution (whereby individual fields and/or individ¬ 
ual records are protected). 

Data Dictionary. If you have a particularly 
complex database structure, you may want to have a 
data dictionary available. This feature allows you to 
ask for information about the data itself. For 


example, you might need to know the attributes 
from different relations that can be correlated 
against one another. 

Integrity Constraints. Some systems give you 
the ability to place additional constraints on the 
data. For example, one could specify that “salaries 
must be positive” or “every employee must be in a 
department”. 

Non-Traditional Data Types. Typically, data¬ 
base systems have concentrated on fairly ordinary 
data, such as integers, character strings, and the 
like. As the use of databases expands, they are being 
extended to handle new types, such as text, 
graphics, and “experts” (a time expert, for example, 
would understand “yesterday”, “three weeks from 
last Tuesday”, and other time and date-oriented 
constructs). 

As superminicomputers have steadily become 
more available, the availability of comparably sized 
database management systems also has increased. 
These systems are often relational , which is to say 
they operate on data in simple tables rather than in 
less flexible data structures, such as hierarchies or 
networks. 

All database management systems offer certain 
basic operations: retrieval, insertion, deletion, and 
modification. Larger systems may also provide 
aggregation and the ability to correlate tables 
against one another. 

User interfaces represent the most obvious 
differences between database management systems 
today. Many kinds of interfaces are available, 
ranging from complex programmer interfaces to 
simple facilities that can be used by relative 
novices. 

Internally, database management systems may 
have many important features. Since these are 
usually less obvious than the differences between 
interfaces, special care should be taken to evaluate 
them carefully. Every feature has a cost, but failure 
to include an important feature can make the 
difference between a pleasant interaction with data 
and a sentence to live out a technological 
nightmare. 


Eric Allman has spent many years working on almost 
all aspects of database management systems. Currently , 
he works at Britton Lee developing interfaces to 
database machines. He previously spent several years at 
UC Berkeley , working on the development of the 
INGRES system. Between database responsibilities he 
has worked on a variety of other projects , including text 
preparation , electronic mail , and computer games. ■ 

Material presented in this article was first presented at the Spring 
1984 European UNIX Users’ Group Conference. UNIX REVIEW 
expresses its gratitude to the EUUG. 
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SETTING THE STANDARD FOR TOMORROW 


■■■■ 
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RUBIX™ is a high performance database manage¬ 
ment system (DBMS) for the entire range of com¬ 
puters—from single user microcomputers to large 
mainframes. It offers the ideal solution to the micro¬ 
mainframe compatibility issue. RUBIX is a true rela¬ 
tional DBMS which is a fully integrated part of UNIX, 
not an afterthought. The friendly English-like user 
language and relational programming language (Q) 
allow complex Applications to be developed in a frac- 
associated with conventional devel- 
nd where desired, the developer can 
st complete Host Language DBMS 
Interface now available. Extensions to the relational 
model, including dated relations and updated views, 
he most powerful DBMS available 
today. RUBIX ife the only DBMS under UNIX which 
supports simultaneous access to multiple databases 
ions across databases. And, on net¬ 
work computeris, the databases can even reside on 
different machines. 


Data manipulati 

■ The RUBIX i 

■ Shell scripts 
UNIX comm 

■ Queries pro 
preter 


on may be performed by: 
interactive relation editors 
containing interactive RUBIX and 
inds 

cessed interpretively by the Q inter- 


■ Compiled queries linked with C functions 

■ C programs invoking RUBIX macros and functions 

PREFIX™ was designed to provide nonprogrammers 
a convenient and easy-to-use interface to RUBIX. 
Through PREFIX, even nonprogrammers can gener¬ 
ate applications software for online transaction pro¬ 
cessing environments. 

PREFIX features include: 

■ Interactive full-screen forms generation facility 

■ User-friendly menu interface 

■ Online context sensitive help 

■ Comprehensive data entry, editing and validation 

■ Automatic retrieval from other relations and data¬ 
bases 

■ Multiple search facilities 

■ C language interface 


RUBIX/PREFIX offers an efficient method for creat¬ 
ing applications tailored to specific needs. The result¬ 
ing systems are quickly learned and reliably operated 
by clerical personnel with little or no computer 
background. 


When productivity is the key, the choice is RUBIX. 



INFOSYSTEMS TECHNOLOGY, INC. 

6301 Ivy Lane / Greenbelt, MD 20770 / (301) 345-7800 
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MAKING A MATCH 



Use the Power 
of Your Computer 

. . . to automatically look up city, 
state and county information based 
on zip code. Table of 48,000 zips al¬ 
lows significant savings on data 
entry, error corrections and file 
maintenance. This set of floppy 
disks, including easy instructions, 
is just $149. Most popular 5'A" and 
8" formats are available. Hard disk 
required. Call or write for free infor¬ 
mation. 

DCC Data Service 

1990 M Street, N.W., Suite 610 
Washington, D.C. 20036 

202-452-1419 
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Tree Shell 

A Graphic Visual 
Shell for Unix/ 

Xenix End-Users and 


Experts Alike! 



Dealer inquiries welcomed. 

M 



"A Higher Form of Software" 

24000 Telegraph Road 
Southfield, Ml 48034 
(313) 352-2345 

TELEX: 386581 COGITATE USA 
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Continued from Page 41 

piping data between front-end 
and backend may argue against 
the two-process architecture 
on small machines. Computers 
that accommodate many users, 
though, can benefit in several 
ways from the front-end/backend 
approach. 

First of all, UNIX runs all its 
programs as re-entrant code, 
which is to say that even when 
two or more users are running the 
same program, the code segment 
is still only loaded once. However, 
each user gets a private data 
segment. Thus, when all users on 
a system run the same query 
language, the code only occupies 
physical memory once. However, 
if each user runs different data¬ 
base programs, code segments for 
each program will have to be 
loaded. Chances are, though, that 
80 percent of the code is redun¬ 
dant across all the segments 
since each contains the same 
DBMS retrieval code. Thus, if a 
single backend process were han¬ 
dling all calls from user pro¬ 
cesses, the code for these differ¬ 
ent programs could be shared to a 
large extent. 

A problem with the two-pro¬ 
cess architecture, though, is that 
it makes it impossible for a pro¬ 
cess to talk to several other 
processes. Strictly speaking, each 
of the backends looks like a 
separate process to UNIX; they all 
simply see to it that the system 
only loads code once. Several 
different data segments still exist 
for a backend process, which 
really only wants one—meaning 
that buffered data needs to be 
stored redundantly. This is bad 
both because it is inefficient and 
because it can cause “concur¬ 
rency control” problems of its 
own. 

The secret to turning this prob¬ 
lem into an advantage is the use 
of shared memory (standardized 


in System V). Shared memory 
allows a backend process to look 
like several processes to the UNIX 
process table, but to still “share” 
data areas with other invocations 
of itself. In essence, the backend 
process becomes one beast with 
several concurrent pipes to re¬ 
questing processes. This is the 
architecture required for DBMS 
efficiency. The more users there 
are, the more important this ar¬ 
chitecture becomes. Pipe over¬ 
head thus is more than paid for. 

CONCLUSION 

Throughout the history of 
DBMS products on UNIX, the 
advent of standards has always 
helped. This has been particular¬ 
ly true in the case of locking and 
shared memory. Other interest¬ 
ing standardization efforts are 
also going to have a major impact 
on UNIX. For example, European 
UNIX suppliers have developed a 
group called X/OPEN to standard¬ 
ize on a subset of the System V 
Interface Definition and make a 
number of “commercial exten¬ 
sions”, such as C-ISAM indexed 
file manipulation calls. 

It is interesting that the com¬ 
mercial extensions have focused 
on typical data processing needs. 
This trend places UNIX and 
DBMS directly at the heart 
of commercial data processing. 
UNIX clearly has gone through a 
tough evolution, but it has adap¬ 
ted well and finally come of age. 


Roger J. Sippl is President of 
Relational Database Systems, Inc. 
(soon to move from Palo Alto to 
Menlo Park, CA). As such, he has 
helped pioneer the UNIX DBMS 
market with such products as Infor¬ 
mix and the recently released Infor- 
mix-SQL, which includes embedded 
SQL interfaces for C and COBOL. 
Mr. Sippl holds a degree in Com¬ 
puter Science from UC Berkeley, 
and is a founder and former board 
member of /usr/group. ■ 




















Connect your IBM, Apple, Tandy, 

Zenith, A.T.&T., Hewlett-Packard, 
Televideo, NCR, IMS, SUN, or other 
DOS or UNIX-based system to 
another micro or to your mainframe 
with CLEO Software. 

Now you can connect your PC LAN, too! 

For details call: 1(800) 233-CLEO 
In Illinois 1(815) 397-8110 


CLEO and 3780Plus are registered trademarks of CLEO Software. 

IBM is a registered trademark of International Business Machines Corporation: Apple is a registered trademark 
of Apple Computer: UNIX Is a registered trademark of A.T.&T. Technologies, Inc. 


Circle No. 55 on Inquiry Card 


CLEO Software 

a division of Phone 1, Inc. 
1639 North Alpine Road 
Rockford, IL 61107 
TELEX 703639 













WEINBERGER INTERVIEW 



WEINBERGER 

Continued from Page 49 

WEINBERGER: Either in the ker¬ 
nel or with interprocess com¬ 
munication, depending on how 
fast you want to go. There’s more 
to the United Airlines system 
than that. The United Airlines 
system has, oh, 10,000 terminals 
or so. A lot of them are what 
are called multidrop terminals. 
They’re all connected by way of 
Bisync, or HDLC, or something, 
and they look a bit like 3270s. 

Even if we had a database 
management system we were 
happy with, one of the things I 
don’t have is the knowledge need¬ 
ed to deal with these communica¬ 
tion networks. That’s a really 
important part of a database 
system. It’s partly a reliability 
thing. You don’t know that your 
transaction is complete until you 
get a message saying so. That 
means you have to log the mes¬ 
sages. You have to control every¬ 
thing. There is no point in doing 
something or, more precisely, fail¬ 
ing to do something if it is not 
recorded. If you enter something, 
but the line goes away, you may 
deserve to know whether or not 
that transaction was completed. 

REVIEW: It may even be worth 
money. 

WEINBERGER: That’s conceiv¬ 
able. 

REVIEW: Do you think that's a 
communications issue? 

WEINBERGER: No, in principle it 
has to be handled wherever the 
database is. I have two separate 
objects that have to be backed 
up—to be logged. I have the 
database itself, and I have com¬ 
munications messages. To get 
both halves of that consistent, 
there’s got to be some controlling 
entity that handles both. So the 
logical control for both has to be 
in one place. It doesn’t have to be 


centralized in any sense, but 
logically, there can be only one 
coordinating entity that handles 
the sending of messages and the 
assigning of tasks. 

REVIEW: That sounds hard to 
do with an add-on package. 

WEINBERGER: It’s certainly dif¬ 
ficult to do right with an add-on 
package. You could probably 
come very close. But, yes, it is 
very hard to do. When my trans¬ 
action runs, it does two things: it 
writes a disk block, and it sends a 
message to the terminal. If you’re 
not careful, you can send them in 
the wrong order, and then what¬ 
ever wasn’t sent first may not 
happen if there’s a failure. 

REVIEW: Is the logging oj com¬ 
munication needed so that if 
a line is dropped. you can give 
the person the message again 
whenever they come back? 

WEINBERGER: Yes. You may de¬ 
sire to know that a message never 
got through. The other thing you 
have to watch out for in the 
database literature is that it deals 
mostly with transaction process¬ 
ing stuff. At least that’s where the 
interesting theoretical questions 
typically have been. Another set 
of questions, though, considers 
what you can do if you do not have 
transactions and you’re not re¬ 
trieving little records. These 
questions surface when you con¬ 
sider relational retrieval systems. 
What if my objects are programs 
or pictures? 

Pictures can be small, but 
maps are big. Of course, records 
can also be very large. What if the 
database is enormous? The cen¬ 
sus is an example of an enormous 
database. You probably have a 
billion bytes or more of data. I 
don’t think anybody’s very good 
at taking data from the census 
and simply putting it into a 
database system—not without 
writing a lot of special code first. 


What’s more, there are only cer¬ 
tain questions you can ask eco¬ 
nomically for a very large data¬ 
base. If I have a 10,000-byte 
database or whatever—you can 
look at every character and think 
hard. But if you have a giant 
database, that’s just not possible. 
That’s a problem with databases 
that exists no matter what sys¬ 
tem you implement your work on. 
As your view of the world 
changes—and as your custom¬ 
ers’ view of the world changes— 
your database will get weirder 
and weirder. One of the argu¬ 
ments in favor of doing things in a 
relational way is that you can do it 
more easily than you would be 
able to with a CODASYL model, 
one of those arrangements like a 
hierarchical database that’s been 
wired into concrete. 

There are also many philo¬ 
sophical questions—take the Bell 
Lab phone book database, for 
instance. Each employee has a 
unique payroll account number, 
so there is no way you can ask the 
database system reliably to see if 
somebody is listed in your data¬ 
base twice if the records have 
different keys, the system be¬ 
lieves the people are different. 
That’s because they can't be in 
the database twice. 

One of the things that happens 
when people start talking about 
networks is that they have this 
idea that you want to be able to 
find things in the network. People 
tend to get carried away with the 
notion of using a database system 
as a name server that will allow 
them to look up nicknames and 
stuff. The fundamental problem 
is the difficulty in distinguishing 
among all the people named “Da¬ 
vid S. Johnson’’. You’re probably 
going to have a lot of them. But 
you don't want to find all the 
people named Dave at Bell Labs 
during your search for just one of 
them. 

REVIEW: You would get mostly 
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WEINBERGER INTERVIEW 


useless information? 

WEINBERGER: Well, I do not 
know if it’s useless information, 
but it’s pretty hard, given a list of 
full real names, to disambiguate 
people. Get out the phone book to 
convince yourself of that. Rela¬ 
tional databases can be trouble¬ 
some—even as ideal objects. 

REVIEW: What about the rela¬ 
tional model attracts you? 

WEINBERGER: I find the rela¬ 
tional model fairly simple to un¬ 
derstand. The hierarchical model 
sounds like the UNIX system 
directory tree, but it’s not; it’s 
nothing at all like the UNIX 
directory. All the layers are of 
different types. 

The point is to keep things just 
as simple as possible—perhaps 
even a little simpler so as to better 
restrain you from the tendency to 
make things complicated. 

REVIEW: The question of UNIX 
suitability for databases then 
comes down to whether it's 
unsuitable or not. Considering 
that it's being used for data¬ 
base solutions today , it's clearly 
not unsuitable. It's just that the 
database solutions aren't being 
built into UNIX. 

WEINBERGER: That’s right. 

REVIEW: So you don't need to 
abandon UNIX whenever you 
face a problem that requires a 
database. 

WEINBERGER: Right. There’s a 
graduate student at Princeton 
who is building a file system that 
you can think of as a database file 
system. When you read and write 
using it, locking is handled for 
you automatically. The system 
knows about transactions, and it 
has a UNIX way of dealing with 
these things. A lot of the conven¬ 
tional trappings of transaction 
processing simply aren’t there. 
It’s all handled for you whether 
you ask for it or not. 
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REVIEW: How is he implement¬ 
ing it? 

WEINBERGER: It looks like a 
piece of a disk, but there’s a hook 
built in that notices when reads 
and writes occur, and handles 
locking and logging automatic¬ 
ally. He can tell what each pro¬ 
cess is doing since he’s in the 
kernel. The scheme does not use 
/usr/group standard record lock¬ 
ing, but the record locking works 
just the same. It’s all done secret¬ 
ly for you, or to you, depending 
on how you’ve asked for it. The 
goal was not to implement a 
database system under UNIX. It 
was to get fairly realistic mea¬ 
surements comparing different 
forms of concurrency control and 
crash recovery. 

The concurrency information 
is essentially the core of the 
research. There’s a lot of talk 
about that. I think the consensus 
is that the most efficient means 
for doing concurrency control and 
crash recovery happens to be the 
way that big commercial systems 
use rather than all the other 
fancy techniques people have 
thought up. Not everybody agrees, 
of course. 

REVIEW: Do you believe that 
there's only one solution? 

WEINBERGER: I don’t think that 
there’s only one solution, but I 
think at the moment there’s a 
clear-cut first choice if you’re 
trying to build a fairly high perfor¬ 
mance database system. You use 
locking instead of optimistic con¬ 
currency control, and you use 
logging instead of shadow paging 
or whatever the other alterna¬ 
tives are. 

REVIEW: Doesn't shadow pag¬ 
ing buy you a bit in terms of 
recovery? 

WEINBERGER: Yes, shadow pag¬ 
ing makes a lot of aspects of 
recovery easier. I had a package 
that did some kind of shadow 


paging in index trees, and we 
never had to roll back a transac¬ 
tion. It was wonderful. Until you 
committed new pages, they were 
invisible. The file got bigger, but 
you couldn’t see the new pages 
logically. The problem is that you 
have to write whole pages to do 
that, and if your transactions are 
small, the log records will be 
small and won’t take up whole 
pages. Then, if you’ve carefully 
got your file organized so that you 
get several disk blocks per revolu¬ 
tion but your shadow pages end 
up getting placed someplace else, 
you’re going to use the heads a lot. 
That makes the scheme seem 
noticeably less efficient. But it’s 
not clear that’s always impor¬ 
tant. Shadow pages really do offer 
a lot of conveniences. 

But, for high performance it 
looks like locking and logging are 
going to win. Maybe it’s not a 
consensus; it’s probably contro¬ 
versial. But anyway, that’s the 
way it looks to me. 

REVIEW: Are you still working 
on databases? 

WEINBERGER: No. I ran out of 
things I wanted to do. I’m still 
prepared to speculate about data¬ 
bases, though. 

REVIEW: Do you have any 
speculations in mind? 

WEINBERGER: You’ve heard a 
lot of them already, put forward 
as facts. We haven’t really men¬ 
tioned networking much, though. 
We’ve talked mostly about ques¬ 
tions related to distributed data¬ 
bases. Of course, there’s a lot of 
research being done on distribut¬ 
ed database systems, but it’s 
never been quite clear that the 
notion itself is really a good idea. 
Distributed programs are hard to 
understand. But you always have 
to face up to the question of 
networking when you talk about 
database systems. That really 
complicates things. ■ 
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dynamic growth from where we are today to where we may be 
in the future. 

The exhibit area will feature booths from many of the leading 
suppliers of OS-9 compatible hardware and software. It's a great 
opportunity to increase your skill and knowledge in the latest 
microcomputer software technology. Plan to attend — Register 
Today! 


Seminar only $150 Hotel Package* $350 
Location Marriott Hotel, Des Moines, IA 
Don’t Miss It — Pre-Register Now! 

Call 515-224-1929 or Write 


MICROWARE SYSTEMS CORPORATION 
1866 N.W. 114th St. • Des Moines, IA 50322 

nictowaM>- 


*Hotel package includes 3 nights, single occupancy at the Marriott Hotel 
and registration fee. 

OS-9 and BASIC09 are trademarks of Microware and Motorola 








U CALENDAR 


“UNIX Fundamentals”. Contact: Claire Donahue, 2401 .Colora¬ 
do Ave., 3rd floor, Santa Monica, CA 90404. 213/453-8649. 
September 9-12 LUCID, New York, NY: ‘‘UNIX System 
Administration”. Contact: Alice Moss, 260 Fifth Ave., Suite 
901, New York, NY 10001. 212/807-9444. 

September 9-13 Bunker Ramo Information Systems, Trum¬ 
bull, CT: “Advanced UNIX”. Contact: Bunker Ramo, Trumbull 
Industrial Park, Trumbull, CT 06611. 203/386-2223. 
September 9-13 Computer Technology Group. Chicago and Los 
Angeles: “C Language Programming”. Contact: Computer 
Technology Group, 310 S. Michigan Ave., Chicago, IL 60604. 
800/323-UNIX. 

September 9-20 Information Technology Development Corpo¬ 
ration, Cincinnati: “UNIX for Application Developers”. Contact: 
ITDC, 9952 Pebbleknoll Dr., Cincinnati, OH 45247. 513/741- 
8968. 

September 10-12 Bunker Ramo Information Systems, Trum¬ 
bull, CT: “Diagnostic UNIX”. Contact: Bunker Ramo, Trumbull 
Industrial Park, Trumbull, CT 06611. 203/386-2223. 
September 10-12 Computer Technology Group, Boston and 
Washington, DC: “UNIX Administration”. Contact: Computer 
Technology Group, 310 S. Michigan Ave., Chicago, IL 60604. 
800/323-UNIX. 

September 10-13 Integrated Computer Systems, Los Angeles 
and Washington, DC: “UNIX: A Comprehensive Introduction”. 
Contact: ICS, 45405, Los Angeles, CA 90045. 213/417-8888. 

■- 1 

! UNIX*-C COURSES 

THE UNIX SYSTEM FOR END USERS 

Sept 9-13 • Oct 7-11 • Oct 28-Nov 1 

THE UNIX SYSTEM FOR THE DP 
PROFESSIONAL 

Sept 16-20 • Sept 30-Oct 4 
Nov 4-8 • Dec 9-13 

C LANGUAGE PROGRAMMING 

Sept 23-27 • Nov 18-22 • Dec 2-6 


HANDS-ON SESSIONS IN ALL COURSES 
COURSE FEE AS LOW AS $855 
DISCOUNTS FOR EARLY REGISTRATION 


I COURSE LOCATION I 

WASHINGTON, D.C. 
METROPOLITAN AREA 

| On-Site and customized courses also available. Write | 
| or call for course descriptions and registration | 
| information. | 

(301) 498-0722 

WEBCO INDUSTRIES, INC. 

14918 LAUREL OAKS LANE 
LAUREL, MARYLAND 20707 

J 'UNIX Is a trademark oI AT&T Bell Laboratories 
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September 12-13 Interactive Systems Corp., Santa Monica, 
CA: “Using the Shell”. Contact: Claire Donahue, 2401 Colorado 
Ave., 3rd floor, Santa Monica, CA 90404. 213/453-8649. 
September 16-17 Interactive Systems Corp., Santa Monica, 
CA: “System Administrator’s Overview”. Contact: Claire 
Donahue, 2401 Colorado Ave., 3rd floor, Santa Monica, CA 
90404. 213/453-8649. 

September 16-17 Computer Technology Group, Chicago and 
Los Angeles: “Shell Programming”. Contact: Computer Tech¬ 
nology Group, 310 S. Michigan Ave., Chicago, IL 60604. 800/ 
323-UNIX. 

September 16-17 Computer Technology Group, Boston and 
Washington, DC: "Advanced C Programming Workshop”. 
Contact: Computer Technology Group, 310 S. Michigan Ave., 
Chicago, IL 60604. 800/323-UNIX. 

September 16-18 CL Publications, Cambridge. MA: “C Techni¬ 
cal Seminar”. Contact: Carl Landau, CL Publications, 131 
Townsend St., San Francisco, CA 94107. 415/957-9353. 
September 16-19 AT&T Information Systems, Callaway 
Gardens, GA: “UNIX OS: The First Step”. Contact: AT&T 
Information Systems’ Institute for Communications and Infor¬ 
mation Management, PO Box 8, Pine Mountain, GA 31822- 
0008. 800/247-1212. 

September 17-18 Bunker Ramo Information Systems, Trum¬ 
bull. CT: “UNIX/C Applications”. Contact: Bunker Ramo, 
Trumbull Industrial Park, Trumbull, CT 06611.203/386-2223. 
September 17-20 LUCID, New York. NY: “Advanced C 
Programming”. Contact: Alice Moss, 260 Fifth Ave., Suite 901, 
New York. NY 10001. 212/807-9444. 

September 18-20 Computer Technology Group, Chicago and 
Los Angeles: “Using Advanced UNIX Commands”. Contact: 
Computer Technology Group, 310 S. Michigan Ave., Chicago, IL 
60604. 800/323-UNIX. 

September 18-20 Computer Technology Group, London: “UNIX 
Administration”. Contact: Computer Technology Group, 310 S. 
Michigan Ave., Chicago, IL 60604. 800/323-UNIX. 

September 18-20 Computer Technology Group, Boston and 
Washington, DC: “Advanced C Programming Under UNIX”. 
Contact: Computer Technology Group, 310 S. Michigan Ave., 
Chicago, IL 60604. 800/323-UNIX. 

September 18-20 Digital Equipment Corp., New York: “Com¬ 
prehensive Overview of the UNIX Operating System”. Contact: 
Digital Education Resources, 12 Crosby Drive, Bedford, MA 
01730. 617/276-4949. 

September 18-20 Interactive Systems Corp., Santa Monica, 
CA: “Interactive Networking Tools”. Contact: Claire Donahue, 
2401 Colorado Ave., 3rd floor, Santa Monica, CA 90404. 213/ 
453-8649. 

September 23-24 Computer Technology Group, London: 
“Advanced C Programming Workshop”. Contact: Computer 
Technology Group, 310 S. Michigan Ave., Chicago, IL 60604. 
800/323-UNIX. 

September 23-24 Productivity Products International, Raleigh, 
NC: “The Concepts of Object-Oriented Programming”. Contact: 
Barbara Dunn. Productivity Products Int’l, 27 Glen Rd., Sandy 
Hook. CT 06482. 203/426-1875. 

September 23-24 Interactive Systems Corp., Santa Monica, 
CA: “Advanced Commands for Programmers”. Contact: Claire 
Donahue, 2401 Colorado Ave., 3rd floor, Santa Monica, CA 
90404. 213/453-8649. 

September 23-27 Computer Technology Group, Chicago and 
Los Angeles: “UNIX Internals”. Contact: Computer Technology 
Group, 310 S. Michigan Ave., Chicago, IL 60604. 800/323- 
UNIX. 










"Ylatune CVn/aoaea Tew 
Reatnictiona on 7ho*e 
Daninq Enough to Lead’ 


ZIM is a fully integrated fourth 
generation application development 
system designed for leading system 
integrators and corporate and 
independent applications developers. 

"COMPLETE DEVELOPMENT 
ENVIRONMENT 
-Report Writer 
-Forms Painter and Manager 
-Data Dictionary 
-Application Generator 
-Non-procedural Programming 
Language 
-Compiler 

-C Language Interface 
-Runtime System 
*POST RELATIONAL 
-Entity Relationship Model 
-Powerful extension of Relational 
Model 

"MAINFRAME POWER, 
FUNCTIONALITY AND 
PERFORMANCE 
"APPLICATIONS PORTABILITY 
-MS-DOS, UNIX, XENIX, and QNX 
"MULTI-USER 

-Full transaction processing control 
"NETWORKING 

"APPLICATIONS LIMITED ONLY 
BY HARDWARE 
"BUILT-IN STRATEGY 
OPTIMIZER 

"ENGLISH-LIKE LANGUAGE 
"QUALITY PRODUCT SUPPORT 
ZIM is a mainframe system that runs 
on micro-computers and on super 
micro-computers. If you want 
mainframe power, speed, flexibility and 
freedom from arbitrary limitations all 
at a micro price, talk to us about an 
evaluation system. 

Dealer inquiries are welcome. 
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The Information Interface 
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1785 Woodward Dr., Ottawa, Ontario 
K2C0R1 (613)727-1397 

MS-DOS and XENIX .ire Microsoft Corp. tradema 
UNIX is an AT&T trademark. QNX is a Quantu 
Software Systems trademark. 
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Only Microware's OS-9 
Operating System Covers 
the Entire 68000 Spectrum 



Is complicated software and expensive hardware 
keeping you back from Unix? Look into OS-9, the 
operating system from Microware that gives 68000 systems 
a Unix-style environment with much less overhead and 
complexity. 

OS-9 is versatile, inexpensive, and delivers outstanding 
performance on any size system. The OS-9 executive is 
much smaller and far more ef¬ 
ficient than Unix because it's 
written in fast, compact as¬ 
sembly language, making it 
ideal for critical real-time ap¬ 
plications. OS-9 can run on 
a broad range of 8 to 32 bit 
systems based on the 68000 
or 6809 family MPUs from 
ROM-based industrial con¬ 
trollers up to large multiuser 
systems. 

OS-9'S OUTSTANDING 
C COMPILER IS 
YOUR BRIDGE TO UNIX 

Microwares C compiler tech¬ 
nology is another OS-9 advantage. The compiler produces 
extremely fast, compact, and ROMable code. You can easily 
develop and port system or application software back and 
forth to standard Unix systems. Cross-compiler versions for 


VAX and PDP-11 make coordinated Unix/OS-9 software 
development a pleasure. 

SUPPORT FOR MODULAR SOFTWARE 
- AN OS-9 EXCLUSIVE 

Comprehensive support for modular software puts OS-9 
a generation ahead of other operating systems. It multiplies 
programmer productivity and memory efficiency. Applica¬ 
tion software can be built 
from individually testable 
software modules including 
standard 'library" modules. 
The modular structure lets 
you customize and recon¬ 
figure OS-9 for specific hard¬ 
ware easily and quickly. 

A SYSTEM WITH 
A PROVEN 
TRACK RECORD 

Once an underground 
classic, OS-9 is now a solid 
hit. Since 1980 OS-9 has 
been ported to over a hun¬ 
dred 6809 and 68000 
systems under license to some of the biggest names in the 
business. OS-9 has been imbedded in numerous consumer, 
industrial, and OEM products, and is supported by many 
independent software suppliers. 


Key OS-9 Features At A Glance 

• Compact (16K) ROMable executive written in assembly 
language 

• User “shell” and complete utility set written in C 

• C-source code level compatibility with Unix 

• Full Multitasking/multiuser capabilities 

• Modular design - extremely easy to adapt, modify, or 
expand 

• Unix-type tree structured file system 

• Rugged “crash-proof” file structure with record locking 

• Works well with floppy disk or ROM-based systems 

• Uses hardware or software memory management 

• High performance C, Pascal, Basic and Cobol compilers 


OS-9 


MICROWARE SYSTEMS CORPORATION 
1866 NW 114th Street 
Des Moines, Iowa 50322 
Phone 515-224-1929 


Telex 910-520-2535 


Microware Japan, Ltd 
3-8-9 Baraki, Ichikawa City 
Chiba 272-01, Japan 
Phone 0473(28)4493 

Telex 299-3122 


OS-9 is a trademark of Microware and Motorola. Unix is a trademark of Bell Labs. 

























